* [PATCHV4 0/3] Machine check recovery when kernel accesses poison @ 2015-12-24 20:54 Tony Luck 2015-12-16 1:29 ` [PATCHV4 1/3] x86, ras: Add new infrastructure for machine check fixup tables Tony Luck ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Tony Luck @ 2015-12-24 20:54 UTC (permalink / raw) To: Ingo Molnar Cc: Borislav Petkov, Andrew Morton, Andy Lutomirski, Dan Williams, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 Ingo: I think I have fixed up everything to make all the people who commented happy. Do you have any further suggestions, or is this ready to go into the tip tree? This series is initially targeted at the folks doing filesystems on top of NVDIMMs. They really want to be able to return -EIO when there is a h/w error (just like spinning rust, and SSD does). I plan to use the same infrastructure in parts 1&2 to write a machine check aware "copy_from_user()" that will SIGBUS the calling application when a syscall touches poison in user space (just like we do when the application touches the poison itself). Changes V3-V4: Andy: Simplify fixup_mcexception() by dropping used-once local variable Andy: "Reviewed-by" tag added to part1 Boris: Moved new functions to memcpy_64.S and declaration to asm/string_64.h Boris: Changed name s/mcsafe_memcpy/__mcsafe_copy/ to make it clear that this is an internal function and that return value doesn't follow memcpy() semantics. Boris: "Reviewed-by" tag added to parts 1&2 Changes V2-V3: Andy: Don't hack "regs->ax = BIT(63) | addr;" in the machine check handler. Now have better fixup code that computes the number of remaining bytes (just like page-fault fixup). Andy: #define for BIT(63). Done, plus couple of extra macros using it. Boris: Don't clutter up generic code (like mm/extable.c) with this. I moved everything under arch/x86 (the asm-generic change is a more generic #define). Boris: Dependencies for CONFIG_MCE_KERNEL_RECOVERY are too generic. I made it a real menu item with default "n". Dan Williams will use "select MCE_KERNEL_RECOVERY" from his persistent filesystem code. Boris: Simplify conditionals in mce.c by moving tolerant/kill_it checks earlier, with a skip to end if they aren't set. Boris: Miscellaneous grammar/punctuation. Fixed. Boris: Don't leak spurious __start_mcextable symbols into kernels that didn't configure MCE_KERNEL_RECOVERY. Done. Tony: New code doesn't belong in user_copy_64.S/uaccess*.h. Moved to new .S/.h files Elliott:Cacheing behavior non-optimal. Could use movntdqa, vmovntdqa or vmovntdqa on source addresses. I didn't fix this yet. Think of the current mcsafe_memcpy() as the first of several functions. This one is useful for small copies (meta-data) where the overhead of saving SSE/AVX state isn't justified. Changes V1->V2: 0-day: Reported build errors and warnings on 32-bit systems. Fixed 0-day: Reported bloat to tinyconfig. Fixed Boris: Suggestions to use extra macros to reduce code duplication in _ASM_*EXTABLE. Done Boris: Re-write "tolerant==3" check to reduce indentation level. See below. Andy: Check IP is valid before searching kernel exception tables. Done. Andy: Explain use of BIT(63) on return value from mcsafe_memcpy(). Done (added decode macros). Andy: Untangle mess of code in tail of do_machine_check() to make it clear what is going on (e.g. that we only enter the ist_begin_non_atomic() if we were called from user code, not from kernel!). Done. Tony Luck (3): x86, ras: Add new infrastructure for machine check fixup tables x86, ras: Extend machine check recovery code to annotated ring0 areas x86, ras: Add __mcsafe_copy() function to recover from machine checks arch/x86/Kconfig | 10 +++ arch/x86/include/asm/asm.h | 10 ++- arch/x86/include/asm/mce.h | 14 ++++ arch/x86/include/asm/string_64.h | 8 ++ arch/x86/kernel/cpu/mcheck/mce-severity.c | 21 ++++- arch/x86/kernel/cpu/mcheck/mce.c | 86 +++++++++++-------- arch/x86/kernel/vmlinux.lds.S | 6 +- arch/x86/kernel/x8664_ksyms_64.c | 4 + arch/x86/lib/memcpy_64.S | 133 ++++++++++++++++++++++++++++++ arch/x86/mm/extable.c | 16 ++++ include/asm-generic/vmlinux.lds.h | 12 +-- 11 files changed, 276 insertions(+), 44 deletions(-) -- 2.1.4 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCHV4 1/3] x86, ras: Add new infrastructure for machine check fixup tables 2015-12-24 20:54 [PATCHV4 0/3] Machine check recovery when kernel accesses poison Tony Luck @ 2015-12-16 1:29 ` Tony Luck 2015-12-16 1:29 ` [PATCHV4 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas Tony Luck 2015-12-16 1:30 ` [PATCHV4 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks Tony Luck 2 siblings, 0 replies; 27+ messages in thread From: Tony Luck @ 2015-12-16 1:29 UTC (permalink / raw) To: Ingo Molnar Cc: Borislav Petkov, Andrew Morton, Andy Lutomirski, Dan Williams, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 Copy the existing page fault fixup mechanisms to create a new table to be used when fixing machine checks. Note: 1) At this time we only provide a macro to annotate assembly code 2) We assume all fixups will in code builtin to the kernel. 3) Only for x86_64 4) New code under CONFIG_MCE_KERNEL_RECOVERY (default 'n') Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/Kconfig | 10 ++++++++++ arch/x86/include/asm/asm.h | 10 ++++++++-- arch/x86/include/asm/mce.h | 14 ++++++++++++++ arch/x86/kernel/cpu/mcheck/mce.c | 16 ++++++++++++++++ arch/x86/kernel/vmlinux.lds.S | 6 +++++- arch/x86/mm/extable.c | 16 ++++++++++++++++ include/asm-generic/vmlinux.lds.h | 12 +++++++----- 7 files changed, 76 insertions(+), 8 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 96d058a87100..42d26b4d1ec4 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1001,6 +1001,16 @@ config X86_MCE_INJECT If you don't know what a machine check is and you don't do kernel QA it is safe to say n. +config MCE_KERNEL_RECOVERY + bool "Recovery from machine checks in special kernel memory copy functions" + default n + depends on X86_MCE && X86_64 + ---help--- + This option provides a new memory copy function mcsafe_memcpy() + that is annotated to allow the machine check handler to return + to an alternate code path to return an error to the caller instead + of crashing the system. Say yes if you have a driver that uses this. + config X86_THERMAL_VECTOR def_bool y depends on X86_MCE_INTEL diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h index 189679aba703..a5d483ac11fa 100644 --- a/arch/x86/include/asm/asm.h +++ b/arch/x86/include/asm/asm.h @@ -44,13 +44,19 @@ /* Exception table entry */ #ifdef __ASSEMBLY__ -# define _ASM_EXTABLE(from,to) \ - .pushsection "__ex_table","a" ; \ +# define __ASM_EXTABLE(from, to, table) \ + .pushsection table, "a" ; \ .balign 8 ; \ .long (from) - . ; \ .long (to) - . ; \ .popsection +# define _ASM_EXTABLE(from, to) \ + __ASM_EXTABLE(from, to, "__ex_table") + +# define _ASM_MCEXTABLE(from, to) \ + __ASM_EXTABLE(from, to, "__mcex_table") + # define _ASM_EXTABLE_EX(from,to) \ .pushsection "__ex_table","a" ; \ .balign 8 ; \ diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 2dbc0bf2b9f3..9dc11d2a9db1 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -279,4 +279,18 @@ struct cper_sec_mem_err; extern void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err); +#ifdef CONFIG_MCE_KERNEL_RECOVERY +const struct exception_table_entry *search_mcexception_tables(unsigned long a); +int fixup_mcexception(struct pt_regs *regs); +#else +static inline const struct exception_table_entry *search_mcexception_tables(unsigned long a) +{ + return 0; +} +static inline int fixup_mcexception(struct pt_regs *regs) +{ + return 0; +} +#endif + #endif /* _ASM_X86_MCE_H */ diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 9d014b82a124..0111cd49ee94 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -31,6 +31,7 @@ #include <linux/types.h> #include <linux/slab.h> #include <linux/init.h> +#include <linux/module.h> #include <linux/kmod.h> #include <linux/poll.h> #include <linux/nmi.h> @@ -2022,8 +2023,23 @@ static int __init mcheck_enable(char *str) } __setup("mce", mcheck_enable); +#ifdef CONFIG_MCE_KERNEL_RECOVERY +extern struct exception_table_entry __start___mcex_table[]; +extern struct exception_table_entry __stop___mcex_table[]; + +/* Given an address, look for it in the machine check exception tables. */ +const struct exception_table_entry *search_mcexception_tables(unsigned long addr) +{ + return search_extable(__start___mcex_table, __stop___mcex_table-1, addr); +} +#endif + int __init mcheck_init(void) { +#ifdef CONFIG_MCE_KERNEL_RECOVERY + if (__stop___mcex_table > __start___mcex_table) + sort_extable(__start___mcex_table, __stop___mcex_table); +#endif mcheck_intel_therm_init(); mce_register_decode_chain(&mce_srao_nb); mcheck_vendor_init_severity(); diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 74e4bf11f562..a65fa0deda06 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -110,7 +110,11 @@ SECTIONS NOTES :text :note - EXCEPTION_TABLE(16) :text = 0x9090 + EXCEPTION_TABLE(16) +#ifdef CONFIG_MCE_KERNEL_RECOVERY + NAMED_EXCEPTION_TABLE(16, mcex) +#endif + :text = 0x9090 #if defined(CONFIG_DEBUG_RODATA) /* .text should occupy whole number of pages */ diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index 903ec1e9c326..4ee867e30c66 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -2,6 +2,7 @@ #include <linux/spinlock.h> #include <linux/sort.h> #include <asm/uaccess.h> +#include <asm/mce.h> static inline unsigned long ex_insn_addr(const struct exception_table_entry *x) @@ -49,6 +50,21 @@ int fixup_exception(struct pt_regs *regs) return 0; } +#ifdef CONFIG_MCE_KERNEL_RECOVERY +int fixup_mcexception(struct pt_regs *regs) +{ + const struct exception_table_entry *fixup; + + fixup = search_mcexception_tables(regs->ip); + if (fixup) { + regs->ip = ex_fixup_addr(fixup); + return 1; + } + + return 0; +} +#endif + /* Restricted version used during very early boot */ int __init early_fixup_exception(unsigned long *ip) { diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 1781e54ea6d3..42ef98de373a 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -467,14 +467,16 @@ /* * Exception table */ -#define EXCEPTION_TABLE(align) \ +#define NAMED_EXCEPTION_TABLE(align, pfx) \ . = ALIGN(align); \ - __ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) { \ - VMLINUX_SYMBOL(__start___ex_table) = .; \ - *(__ex_table) \ - VMLINUX_SYMBOL(__stop___ex_table) = .; \ + __##pfx##_table : AT(ADDR(__##pfx##_table) - LOAD_OFFSET) { \ + VMLINUX_SYMBOL(__start___##pfx##_table) = .; \ + *(__##pfx##_table) \ + VMLINUX_SYMBOL(__stop___##pfx##_table) = .; \ } +#define EXCEPTION_TABLE(align) NAMED_EXCEPTION_TABLE(align, ex) + /* * Init task */ -- 2.1.4 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCHV4 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas 2015-12-24 20:54 [PATCHV4 0/3] Machine check recovery when kernel accesses poison Tony Luck 2015-12-16 1:29 ` [PATCHV4 1/3] x86, ras: Add new infrastructure for machine check fixup tables Tony Luck @ 2015-12-16 1:29 ` Tony Luck 2015-12-16 1:30 ` [PATCHV4 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks Tony Luck 2 siblings, 0 replies; 27+ messages in thread From: Tony Luck @ 2015-12-16 1:29 UTC (permalink / raw) To: Ingo Molnar Cc: Borislav Petkov, Andrew Morton, Andy Lutomirski, Dan Williams, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 Extend the severity checking code to add a new context IN_KERN_RECOV which is used to indicate that the machine check was triggered by code in the kernel with a fixup entry. Add code to check for this situation and respond by altering the return IP to the fixup address. Major re-work to the tail code in do_machine_check() to make all this readable/maintainable. One functional change is that tolerant=3 no longer stops recovery actions. Revert to only skipping sending SIGBUS to the current process. Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/mcheck/mce-severity.c | 21 +++++++++- arch/x86/kernel/cpu/mcheck/mce.c | 70 ++++++++++++++++--------------- 2 files changed, 55 insertions(+), 36 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c index 9c682c222071..cc7136351820 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c @@ -29,7 +29,7 @@ * panic situations) */ -enum context { IN_KERNEL = 1, IN_USER = 2 }; +enum context { IN_KERNEL = 1, IN_USER = 2, IN_KERNEL_RECOV = 3 }; enum ser { SER_REQUIRED = 1, NO_SER = 2 }; enum exception { EXCP_CONTEXT = 1, NO_EXCP = 2 }; @@ -48,6 +48,7 @@ static struct severity { #define MCESEV(s, m, c...) { .sev = MCE_ ## s ## _SEVERITY, .msg = m, ## c } #define KERNEL .context = IN_KERNEL #define USER .context = IN_USER +#define KERNEL_RECOV .context = IN_KERNEL_RECOV #define SER .ser = SER_REQUIRED #define NOSER .ser = NO_SER #define EXCP .excp = EXCP_CONTEXT @@ -87,6 +88,10 @@ static struct severity { EXCP, KERNEL, MCGMASK(MCG_STATUS_RIPV, 0) ), MCESEV( + PANIC, "In kernel and no restart IP", + EXCP, KERNEL_RECOV, MCGMASK(MCG_STATUS_RIPV, 0) + ), + MCESEV( DEFERRED, "Deferred error", NOSER, MASK(MCI_STATUS_UC|MCI_STATUS_DEFERRED|MCI_STATUS_POISON, MCI_STATUS_DEFERRED) ), @@ -123,6 +128,11 @@ static struct severity { MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV) ), MCESEV( + AR, "Action required: data load in error recoverable area of kernel", + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA), + KERNEL_RECOV + ), + MCESEV( AR, "Action required: data load error in a user process", SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA), USER @@ -170,6 +180,9 @@ static struct severity { ) /* always matches. keep at end */ }; +#define mc_recoverable(mcg) (((mcg) & (MCG_STATUS_RIPV|MCG_STATUS_EIPV)) == \ + (MCG_STATUS_RIPV|MCG_STATUS_EIPV)) + /* * If mcgstatus indicated that ip/cs on the stack were * no good, then "m->cs" will be zero and we will have @@ -183,7 +196,11 @@ static struct severity { */ static int error_context(struct mce *m) { - return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL; + if ((m->cs & 3) == 3) + return IN_USER; + if (mc_recoverable(m->mcgstatus) && search_mcexception_tables(m->ip)) + return IN_KERNEL_RECOV; + return IN_KERNEL; } /* diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 0111cd49ee94..e848b583e840 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -959,6 +959,20 @@ static void mce_clear_state(unsigned long *toclear) } } +static int do_memory_failure(struct mce *m) +{ + int flags = MF_ACTION_REQUIRED; + int ret; + + pr_err("Uncorrected hardware memory error in user-access at %llx", m->addr); + if (!(m->mcgstatus & MCG_STATUS_RIPV)) + flags |= MF_MUST_KILL; + ret = memory_failure(m->addr >> PAGE_SHIFT, MCE_VECTOR, flags); + if (ret) + pr_err("Memory error not recovered"); + return ret; +} + /* * The actual machine check handler. This only handles real * exceptions when something got corrupted coming in through int 18. @@ -996,8 +1010,6 @@ void do_machine_check(struct pt_regs *regs, long error_code) DECLARE_BITMAP(toclear, MAX_NR_BANKS); DECLARE_BITMAP(valid_banks, MAX_NR_BANKS); char *msg = "Unknown"; - u64 recover_paddr = ~0ull; - int flags = MF_ACTION_REQUIRED; int lmce = 0; ist_enter(regs); @@ -1124,22 +1136,13 @@ void do_machine_check(struct pt_regs *regs, long error_code) } /* - * At insane "tolerant" levels we take no action. Otherwise - * we only die if we have no other choice. For less serious - * issues we try to recover, or limit damage to the current - * process. + * If tolerant is at an insane level we drop requests to kill + * processes and continue even when there is no way out. */ - if (cfg->tolerant < 3) { - if (no_way_out) - mce_panic("Fatal machine check on current CPU", &m, msg); - if (worst == MCE_AR_SEVERITY) { - recover_paddr = m.addr; - if (!(m.mcgstatus & MCG_STATUS_RIPV)) - flags |= MF_MUST_KILL; - } else if (kill_it) { - force_sig(SIGBUS, current); - } - } + if (cfg->tolerant == 3) + kill_it = 0; + else if (no_way_out) + mce_panic("Fatal machine check on current CPU", &m, msg); if (worst > 0) mce_report_event(regs); @@ -1147,25 +1150,24 @@ void do_machine_check(struct pt_regs *regs, long error_code) out: sync_core(); - if (recover_paddr == ~0ull) - goto done; + if (worst != MCE_AR_SEVERITY && !kill_it) + goto out_ist; - pr_err("Uncorrected hardware memory error in user-access at %llx", - recover_paddr); - /* - * We must call memory_failure() here even if the current process is - * doomed. We still need to mark the page as poisoned and alert any - * other users of the page. - */ - ist_begin_non_atomic(regs); - local_irq_enable(); - if (memory_failure(recover_paddr >> PAGE_SHIFT, MCE_VECTOR, flags) < 0) { - pr_err("Memory error not recovered"); - force_sig(SIGBUS, current); + /* Fault was in user mode and we need to take some action */ + if ((m.cs & 3) == 3) { + ist_begin_non_atomic(regs); + local_irq_enable(); + + if (kill_it || do_memory_failure(&m)) + force_sig(SIGBUS, current); + local_irq_disable(); + ist_end_non_atomic(); + } else { + if (!fixup_mcexception(regs)) + mce_panic("Failed kernel mode recovery", &m, NULL); } - local_irq_disable(); - ist_end_non_atomic(); -done: + +out_ist: ist_exit(regs); } EXPORT_SYMBOL_GPL(do_machine_check); -- 2.1.4 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCHV4 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-24 20:54 [PATCHV4 0/3] Machine check recovery when kernel accesses poison Tony Luck 2015-12-16 1:29 ` [PATCHV4 1/3] x86, ras: Add new infrastructure for machine check fixup tables Tony Luck 2015-12-16 1:29 ` [PATCHV4 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas Tony Luck @ 2015-12-16 1:30 ` Tony Luck 2015-12-24 21:46 ` Borislav Petkov 2 siblings, 1 reply; 27+ messages in thread From: Tony Luck @ 2015-12-16 1:30 UTC (permalink / raw) To: Ingo Molnar Cc: Borislav Petkov, Andrew Morton, Andy Lutomirski, Dan Williams, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 Using __copy_user_nocache() as inspiration create a memory copy routine for use by kernel code with annotations to allow for recovery from machine checks. Notes: 1) We align the source address rather than the destination. This means we never have to deal with a memory read that spans two cache lines ... so we can provide a precise indication of where the error occurred without having to re-execute at a byte-by-byte level to find the exact spot like the original did. 2) We 'or' BIT(63) into the return because this is the first in a series of machine check safe functions. Some will copy from user addresses, so may need to indicate an invalid user address instead of a machine check. 3) This code doesn't play any cache games. Future functions can use non-temporal loads/stores to meet needs of different callers. 4) Provide helpful macros to decode the return value. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/string_64.h | 8 +++ arch/x86/kernel/x8664_ksyms_64.c | 4 ++ arch/x86/lib/memcpy_64.S | 133 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 145 insertions(+) diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h index ff8b9a17dc4b..4359ebb86b86 100644 --- a/arch/x86/include/asm/string_64.h +++ b/arch/x86/include/asm/string_64.h @@ -78,6 +78,14 @@ int strcmp(const char *cs, const char *ct); #define memset(s, c, n) __memset(s, c, n) #endif +#ifdef CONFIG_MCE_KERNEL_RECOVERY +u64 __mcsafe_copy(void *dst, const void __user *src, unsigned size); + +#define COPY_MCHECK_ERRBIT BIT(63) +#define COPY_HAD_MCHECK(ret) ((ret) & COPY_MCHECK_ERRBIT) +#define COPY_MCHECK_REMAIN(ret) ((ret) & ~COPY_MCHECK_ERRBIT) +#endif + #endif /* __KERNEL__ */ #endif /* _ASM_X86_STRING_64_H */ diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c index a0695be19864..3d42d0ef3333 100644 --- a/arch/x86/kernel/x8664_ksyms_64.c +++ b/arch/x86/kernel/x8664_ksyms_64.c @@ -37,6 +37,10 @@ EXPORT_SYMBOL(__copy_user_nocache); EXPORT_SYMBOL(_copy_from_user); EXPORT_SYMBOL(_copy_to_user); +#ifdef CONFIG_MCE_KERNEL_RECOVERY +EXPORT_SYMBOL(__mcsafe_copy); +#endif + EXPORT_SYMBOL(copy_page); EXPORT_SYMBOL(clear_page); diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S index 16698bba87de..bef21b18f6f5 100644 --- a/arch/x86/lib/memcpy_64.S +++ b/arch/x86/lib/memcpy_64.S @@ -177,3 +177,136 @@ ENTRY(memcpy_orig) .Lend: retq ENDPROC(memcpy_orig) + +#ifdef CONFIG_MCE_KERNEL_RECOVERY +/* + * __mcsafe_copy - memory copy with machine check exception handling + * Note that we only catch machine checks when reading the source addresses. + * Writes to target are posted and don't generate machine checks. + */ +ENTRY(__mcsafe_copy) + cmpl $8,%edx + jb 20f /* less then 8 bytes, go to byte copy loop */ + + /* check for bad alignment of source */ + movl %esi,%ecx + andl $7,%ecx + jz 102f /* already aligned */ + subl $8,%ecx + negl %ecx + subl %ecx,%edx +0: movb (%rsi),%al + movb %al,(%rdi) + incq %rsi + incq %rdi + decl %ecx + jnz 0b +102: + movl %edx,%ecx + andl $63,%edx + shrl $6,%ecx + jz 17f +1: movq (%rsi),%r8 +2: movq 1*8(%rsi),%r9 +3: movq 2*8(%rsi),%r10 +4: movq 3*8(%rsi),%r11 + mov %r8,(%rdi) + mov %r9,1*8(%rdi) + mov %r10,2*8(%rdi) + mov %r11,3*8(%rdi) +9: movq 4*8(%rsi),%r8 +10: movq 5*8(%rsi),%r9 +11: movq 6*8(%rsi),%r10 +12: movq 7*8(%rsi),%r11 + mov %r8,4*8(%rdi) + mov %r9,5*8(%rdi) + mov %r10,6*8(%rdi) + mov %r11,7*8(%rdi) + leaq 64(%rsi),%rsi + leaq 64(%rdi),%rdi + decl %ecx + jnz 1b +17: movl %edx,%ecx + andl $7,%edx + shrl $3,%ecx + jz 20f +18: movq (%rsi),%r8 + mov %r8,(%rdi) + leaq 8(%rsi),%rsi + leaq 8(%rdi),%rdi + decl %ecx + jnz 18b +20: andl %edx,%edx + jz 23f + movl %edx,%ecx +21: movb (%rsi),%al + movb %al,(%rdi) + incq %rsi + incq %rdi + decl %ecx + jnz 21b +23: xorl %eax,%eax + sfence + ret + + .section .fixup,"ax" +30: + addl %ecx,%edx + jmp 100f +31: + shll $6,%ecx + addl %ecx,%edx + jmp 100f +32: + shll $6,%ecx + leal -8(%ecx,%edx),%edx + jmp 100f +33: + shll $6,%ecx + leal -16(%ecx,%edx),%edx + jmp 100f +34: + shll $6,%ecx + leal -24(%ecx,%edx),%edx + jmp 100f +35: + shll $6,%ecx + leal -32(%ecx,%edx),%edx + jmp 100f +36: + shll $6,%ecx + leal -40(%ecx,%edx),%edx + jmp 100f +37: + shll $6,%ecx + leal -48(%ecx,%edx),%edx + jmp 100f +38: + shll $6,%ecx + leal -56(%ecx,%edx),%edx + jmp 100f +39: + lea (%rdx,%rcx,8),%rdx + jmp 100f +40: + mov %ecx,%edx +100: + sfence + mov %edx,%eax + bts $63,%rax + ret + .previous + + _ASM_MCEXTABLE(0b,30b) + _ASM_MCEXTABLE(1b,31b) + _ASM_MCEXTABLE(2b,32b) + _ASM_MCEXTABLE(3b,33b) + _ASM_MCEXTABLE(4b,34b) + _ASM_MCEXTABLE(9b,35b) + _ASM_MCEXTABLE(10b,36b) + _ASM_MCEXTABLE(11b,37b) + _ASM_MCEXTABLE(12b,38b) + _ASM_MCEXTABLE(18b,39b) + _ASM_MCEXTABLE(21b,40b) +ENDPROC(__mcsafe_copy) +#endif -- 2.1.4 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCHV4 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-16 1:30 ` [PATCHV4 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks Tony Luck @ 2015-12-24 21:46 ` Borislav Petkov 2015-12-16 1:30 ` [PATCHV5 " Tony Luck 0 siblings, 1 reply; 27+ messages in thread From: Borislav Petkov @ 2015-12-24 21:46 UTC (permalink / raw) To: Tony Luck Cc: Ingo Molnar, Andrew Morton, Andy Lutomirski, Dan Williams, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 On Tue, Dec 15, 2015 at 05:30:49PM -0800, Tony Luck wrote: > Using __copy_user_nocache() as inspiration create a memory copy > routine for use by kernel code with annotations to allow for > recovery from machine checks. > > Notes: > 1) We align the source address rather than the destination. This > means we never have to deal with a memory read that spans two > cache lines ... so we can provide a precise indication of > where the error occurred without having to re-execute at > a byte-by-byte level to find the exact spot like the original > did. > 2) We 'or' BIT(63) into the return because this is the first > in a series of machine check safe functions. Some will copy > from user addresses, so may need to indicate an invalid user > address instead of a machine check. > 3) This code doesn't play any cache games. Future functions can > use non-temporal loads/stores to meet needs of different callers. > 4) Provide helpful macros to decode the return value. > > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > arch/x86/include/asm/string_64.h | 8 +++ > arch/x86/kernel/x8664_ksyms_64.c | 4 ++ > arch/x86/lib/memcpy_64.S | 133 +++++++++++++++++++++++++++++++++++++++ > 3 files changed, 145 insertions(+) ... > + lea (%rdx,%rcx,8),%rdx > + jmp 100f > +40: > + mov %ecx,%edx > +100: > + sfence > + mov %edx,%eax > + bts $63,%rax > + ret Huh, bit 63 is still alive? Didn't we just talk about having different return values depending on whether a fault or an MCE happened *instead* of setting that bit? You have two "RET" points in that function, why not return a different value from each? -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-24 21:46 ` Borislav Petkov @ 2015-12-16 1:30 ` Tony Luck 2015-12-25 11:49 ` Borislav Petkov 0 siblings, 1 reply; 27+ messages in thread From: Tony Luck @ 2015-12-16 1:30 UTC (permalink / raw) To: Ingo Molnar Cc: Borislav Petkov, Andrew Morton, Andy Lutomirski, Dan Williams, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 Using __copy_user_nocache() as inspiration create a memory copy routine for use by kernel code with annotations to allow for recovery from machine checks. Notes: 1) We align the source address rather than the destination. This means we never have to deal with a memory read that spans two cache lines ... so we can provide a precise indication of where the error occurred without having to re-execute at a byte-by-byte level to find the exact spot like the original did. 2) We 'or' BIT(63) into the return if the copy failed because of a machine check. If we failed during a copy from user space because the user provided a bad address, then we just return then number of bytes not copied like other copy_from_user functions. 3) This code doesn't play any cache games. Future functions can use non-temporal loads/stores to meet needs of different callers. 4) Provide helpful macros to decode the return value. Signed-off-by: Tony Luck <tony.luck@intel.com> --- Boris: This version has all the return options coded. return 0; /* SUCCESS */ return remain_bytes | (1ul << 63); /* failed because of machine check */ return remain_bytes; /* failed because of invalid source address */ arch/x86/include/asm/string_64.h | 8 ++ arch/x86/kernel/x8664_ksyms_64.c | 4 + arch/x86/lib/memcpy_64.S | 202 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 214 insertions(+) diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h index ff8b9a17dc4b..4359ebb86b86 100644 --- a/arch/x86/include/asm/string_64.h +++ b/arch/x86/include/asm/string_64.h @@ -78,6 +78,14 @@ int strcmp(const char *cs, const char *ct); #define memset(s, c, n) __memset(s, c, n) #endif +#ifdef CONFIG_MCE_KERNEL_RECOVERY +u64 __mcsafe_copy(void *dst, const void __user *src, unsigned size); + +#define COPY_MCHECK_ERRBIT BIT(63) +#define COPY_HAD_MCHECK(ret) ((ret) & COPY_MCHECK_ERRBIT) +#define COPY_MCHECK_REMAIN(ret) ((ret) & ~COPY_MCHECK_ERRBIT) +#endif + #endif /* __KERNEL__ */ #endif /* _ASM_X86_STRING_64_H */ diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c index a0695be19864..3d42d0ef3333 100644 --- a/arch/x86/kernel/x8664_ksyms_64.c +++ b/arch/x86/kernel/x8664_ksyms_64.c @@ -37,6 +37,10 @@ EXPORT_SYMBOL(__copy_user_nocache); EXPORT_SYMBOL(_copy_from_user); EXPORT_SYMBOL(_copy_to_user); +#ifdef CONFIG_MCE_KERNEL_RECOVERY +EXPORT_SYMBOL(__mcsafe_copy); +#endif + EXPORT_SYMBOL(copy_page); EXPORT_SYMBOL(clear_page); diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S index 16698bba87de..23aac781ce59 100644 --- a/arch/x86/lib/memcpy_64.S +++ b/arch/x86/lib/memcpy_64.S @@ -177,3 +177,205 @@ ENTRY(memcpy_orig) .Lend: retq ENDPROC(memcpy_orig) + +#ifdef CONFIG_MCE_KERNEL_RECOVERY +/* + * __mcsafe_copy - memory copy with machine check exception handling + * Note that we only catch machine checks when reading the source addresses. + * Writes to target are posted and don't generate machine checks. + */ +ENTRY(__mcsafe_copy) + cmpl $8,%edx + jb 20f /* less then 8 bytes, go to byte copy loop */ + + /* check for bad alignment of source */ + movl %esi,%ecx + andl $7,%ecx + jz 102f /* already aligned */ + subl $8,%ecx + negl %ecx + subl %ecx,%edx +0: movb (%rsi),%al + movb %al,(%rdi) + incq %rsi + incq %rdi + decl %ecx + jnz 0b +102: + movl %edx,%ecx + andl $63,%edx + shrl $6,%ecx + jz 17f +1: movq (%rsi),%r8 +2: movq 1*8(%rsi),%r9 +3: movq 2*8(%rsi),%r10 +4: movq 3*8(%rsi),%r11 + mov %r8,(%rdi) + mov %r9,1*8(%rdi) + mov %r10,2*8(%rdi) + mov %r11,3*8(%rdi) +9: movq 4*8(%rsi),%r8 +10: movq 5*8(%rsi),%r9 +11: movq 6*8(%rsi),%r10 +12: movq 7*8(%rsi),%r11 + mov %r8,4*8(%rdi) + mov %r9,5*8(%rdi) + mov %r10,6*8(%rdi) + mov %r11,7*8(%rdi) + leaq 64(%rsi),%rsi + leaq 64(%rdi),%rdi + decl %ecx + jnz 1b +17: movl %edx,%ecx + andl $7,%edx + shrl $3,%ecx + jz 20f +18: movq (%rsi),%r8 + mov %r8,(%rdi) + leaq 8(%rsi),%rsi + leaq 8(%rdi),%rdi + decl %ecx + jnz 18b +20: andl %edx,%edx + jz 23f + movl %edx,%ecx +21: movb (%rsi),%al + movb %al,(%rdi) + incq %rsi + incq %rdi + decl %ecx + jnz 21b +23: xorl %eax,%eax + sfence + /* copy successful. return 0 */ + ret + + .section .fixup,"ax" + /* fixups for machine check */ +30: + addl %ecx,%edx + jmp 100f +31: + shll $6,%ecx + addl %ecx,%edx + jmp 100f +32: + shll $6,%ecx + leal -8(%ecx,%edx),%edx + jmp 100f +33: + shll $6,%ecx + leal -16(%ecx,%edx),%edx + jmp 100f +34: + shll $6,%ecx + leal -24(%ecx,%edx),%edx + jmp 100f +35: + shll $6,%ecx + leal -32(%ecx,%edx),%edx + jmp 100f +36: + shll $6,%ecx + leal -40(%ecx,%edx),%edx + jmp 100f +37: + shll $6,%ecx + leal -48(%ecx,%edx),%edx + jmp 100f +38: + shll $6,%ecx + leal -56(%ecx,%edx),%edx + jmp 100f +39: + lea (%rdx,%rcx,8),%rdx + jmp 100f +40: + mov %ecx,%edx +100: + sfence + /* + * copy failed because of machine check on source address + * return value is number of bytes NOT copied plus we set + * bit 63 to distinguish this from the page fault case below. + */ + mov %edx,%eax + bts $63,%rax + ret + + /* fixups for page fault */ +130: + addl %ecx,%edx + jmp 200f +131: + shll $6,%ecx + addl %ecx,%edx + jmp 200f +132: + shll $6,%ecx + leal -8(%ecx,%edx),%edx + jmp 200f +133: + shll $6,%ecx + leal -16(%ecx,%edx),%edx + jmp 200f +134: + shll $6,%ecx + leal -24(%ecx,%edx),%edx + jmp 200f +135: + shll $6,%ecx + leal -32(%ecx,%edx),%edx + jmp 200f +136: + shll $6,%ecx + leal -40(%ecx,%edx),%edx + jmp 200f +137: + shll $6,%ecx + leal -48(%ecx,%edx),%edx + jmp 200f +138: + shll $6,%ecx + leal -56(%ecx,%edx),%edx + jmp 200f +139: + lea (%rdx,%rcx,8),%rdx + jmp 200f +140: + mov %ecx,%edx +200: + sfence + /* + * copy failed because of page fault on source address + * return value is number of bytes NOT copied. + */ + mov %edx,%eax + ret + .previous + + _ASM_EXTABLE(0b,130b) + _ASM_EXTABLE(1b,131b) + _ASM_EXTABLE(2b,132b) + _ASM_EXTABLE(3b,133b) + _ASM_EXTABLE(4b,134b) + _ASM_EXTABLE(9b,135b) + _ASM_EXTABLE(10b,136b) + _ASM_EXTABLE(11b,137b) + _ASM_EXTABLE(12b,138b) + _ASM_EXTABLE(18b,139b) + _ASM_EXTABLE(21b,140b) + + _ASM_MCEXTABLE(0b,30b) + _ASM_MCEXTABLE(1b,31b) + _ASM_MCEXTABLE(2b,32b) + _ASM_MCEXTABLE(3b,33b) + _ASM_MCEXTABLE(4b,34b) + _ASM_MCEXTABLE(9b,35b) + _ASM_MCEXTABLE(10b,36b) + _ASM_MCEXTABLE(11b,37b) + _ASM_MCEXTABLE(12b,38b) + _ASM_MCEXTABLE(18b,39b) + _ASM_MCEXTABLE(21b,40b) +ENDPROC(__mcsafe_copy) +#endif -- 2.1.4 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-16 1:30 ` [PATCHV5 " Tony Luck @ 2015-12-25 11:49 ` Borislav Petkov 2015-12-25 20:05 ` Luck, Tony 0 siblings, 1 reply; 27+ messages in thread From: Borislav Petkov @ 2015-12-25 11:49 UTC (permalink / raw) To: Tony Luck Cc: Ingo Molnar, Andrew Morton, Andy Lutomirski, Dan Williams, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 On Tue, Dec 15, 2015 at 05:30:49PM -0800, Tony Luck wrote: > Using __copy_user_nocache() as inspiration create a memory copy > routine for use by kernel code with annotations to allow for > recovery from machine checks. > > Notes: > 1) We align the source address rather than the destination. This > means we never have to deal with a memory read that spans two > cache lines ... so we can provide a precise indication of > where the error occurred without having to re-execute at > a byte-by-byte level to find the exact spot like the original > did. > 2) We 'or' BIT(63) into the return if the copy failed because of > a machine check. If we failed during a copy from user space > because the user provided a bad address, then we just return > then number of bytes not copied like other copy_from_user > functions. > 3) This code doesn't play any cache games. Future functions can > use non-temporal loads/stores to meet needs of different callers. > 4) Provide helpful macros to decode the return value. > > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > Boris: This version has all the return options coded. > return 0; /* SUCCESS */ > return remain_bytes | (1ul << 63); /* failed because of machine check */ > return remain_bytes; /* failed because of invalid source address */ Ok, how about a much simpler approach and finally getting rid of that bit 63? :-) Here's what we could do, it is totally untested but at least it builds here (full patch below). So first we define __mcsafe_copy to return two u64 values, or two int values or whatever... Bottomline is, we return 2 values with remain_bytes in %rdx and the actual error in %rax. +struct mcsafe_ret { + u64 ret; + u64 remain; +}; + +struct mcsafe_ret __mcsafe_copy(void *dst, const void __user *src, unsigned size); Then, in fixup_exception()/fixup_mcexception(), we set the *respective* regs->ax (which is mcsafe_ret.ret) depending on which function is fixing up the exception. I've made it return -EINVAL and -EFAULT respectively but those are arbitrary. We detect that we're in __mcsafe_copy() by using its start and a previously defined end label. I've done this in order to get rid of the mce-specific exception tables. Mind you, this is still precise enough since we're using the _ASM_EXTABLE entries from __mcsafe_copy. And this approach gets rid of those mce-specific exception tables, bit 63, makes __mcsafe_copy simpler, you name it... :-) Thoughts? --- diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index adb28a2dab44..efef4d72674c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1021,6 +1021,16 @@ config X86_MCE_INJECT If you don't know what a machine check is and you don't do kernel QA it is safe to say n. +config MCE_KERNEL_RECOVERY + bool "Recovery from machine checks in special kernel memory copy functions" + default n + depends on X86_MCE && X86_64 + ---help--- + This option provides a new memory copy function mcsafe_memcpy() + that is annotated to allow the machine check handler to return + to an alternate code path to return an error to the caller instead + of crashing the system. Say yes if you have a driver that uses this. + config X86_THERMAL_VECTOR def_bool y depends on X86_MCE_INTEL diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 2ea4527e462f..9c5371d1069b 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -287,4 +287,13 @@ struct cper_sec_mem_err; extern void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err); +#ifdef CONFIG_MCE_KERNEL_RECOVERY +int fixup_mcexception(struct pt_regs *regs); +#else +static inline int fixup_mcexception(struct pt_regs *regs) +{ + return 0; +} +#endif + #endif /* _ASM_X86_MCE_H */ diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h index ff8b9a17dc4b..6b6431797749 100644 --- a/arch/x86/include/asm/string_64.h +++ b/arch/x86/include/asm/string_64.h @@ -78,6 +78,16 @@ int strcmp(const char *cs, const char *ct); #define memset(s, c, n) __memset(s, c, n) #endif +#ifdef CONFIG_MCE_KERNEL_RECOVERY +struct mcsafe_ret { + u64 ret; + u64 remain; +}; + +struct mcsafe_ret __mcsafe_copy(void *dst, const void __user *src, unsigned size); +extern void __mcsafe_copy_end(void); +#endif + #endif /* __KERNEL__ */ #endif /* _ASM_X86_STRING_64_H */ diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h index 547720efd923..e8a2c8067fcb 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h @@ -80,4 +80,12 @@ static inline int apei_clear_mce(u64 record_id) } #endif +#ifdef CONFIG_MCE_KERNEL_RECOVERY +static inline bool mce_in_kernel_recov(unsigned long addr) +{ + return (addr >= (unsigned long)__mcsafe_copy && + addr <= (unsigned long)__mcsafe_copy_end); +} +#endif + void mce_inject_log(struct mce *m); diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c index 9c682c222071..a51f0d28cc06 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c @@ -29,7 +29,7 @@ * panic situations) */ -enum context { IN_KERNEL = 1, IN_USER = 2 }; +enum context { IN_KERNEL = 1, IN_USER = 2, IN_KERNEL_RECOV = 3 }; enum ser { SER_REQUIRED = 1, NO_SER = 2 }; enum exception { EXCP_CONTEXT = 1, NO_EXCP = 2 }; @@ -48,6 +48,7 @@ static struct severity { #define MCESEV(s, m, c...) { .sev = MCE_ ## s ## _SEVERITY, .msg = m, ## c } #define KERNEL .context = IN_KERNEL #define USER .context = IN_USER +#define KERNEL_RECOV .context = IN_KERNEL_RECOV #define SER .ser = SER_REQUIRED #define NOSER .ser = NO_SER #define EXCP .excp = EXCP_CONTEXT @@ -87,6 +88,10 @@ static struct severity { EXCP, KERNEL, MCGMASK(MCG_STATUS_RIPV, 0) ), MCESEV( + PANIC, "In kernel and no restart IP", + EXCP, KERNEL_RECOV, MCGMASK(MCG_STATUS_RIPV, 0) + ), + MCESEV( DEFERRED, "Deferred error", NOSER, MASK(MCI_STATUS_UC|MCI_STATUS_DEFERRED|MCI_STATUS_POISON, MCI_STATUS_DEFERRED) ), @@ -123,6 +128,11 @@ static struct severity { MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV) ), MCESEV( + AR, "Action required: data load in error recoverable area of kernel", + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA), + KERNEL_RECOV + ), + MCESEV( AR, "Action required: data load error in a user process", SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA), USER @@ -170,6 +180,9 @@ static struct severity { ) /* always matches. keep at end */ }; +#define mc_recoverable(mcg) (((mcg) & (MCG_STATUS_RIPV|MCG_STATUS_EIPV)) == \ + (MCG_STATUS_RIPV|MCG_STATUS_EIPV)) + /* * If mcgstatus indicated that ip/cs on the stack were * no good, then "m->cs" will be zero and we will have @@ -183,7 +196,11 @@ static struct severity { */ static int error_context(struct mce *m) { - return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL; + if ((m->cs & 3) == 3) + return IN_USER; + if (mc_recoverable(m->mcgstatus) && mce_in_kernel_recov(m->ip)) + return IN_KERNEL_RECOV; + return IN_KERNEL; } /* diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index a006f4cd792b..3dc2cbc3f62d 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -31,6 +31,7 @@ #include <linux/types.h> #include <linux/slab.h> #include <linux/init.h> +#include <linux/module.h> #include <linux/kmod.h> #include <linux/poll.h> #include <linux/nmi.h> @@ -961,6 +962,20 @@ static void mce_clear_state(unsigned long *toclear) } } +static int do_memory_failure(struct mce *m) +{ + int flags = MF_ACTION_REQUIRED; + int ret; + + pr_err("Uncorrected hardware memory error in user-access at %llx", m->addr); + if (!(m->mcgstatus & MCG_STATUS_RIPV)) + flags |= MF_MUST_KILL; + ret = memory_failure(m->addr >> PAGE_SHIFT, MCE_VECTOR, flags); + if (ret) + pr_err("Memory error not recovered"); + return ret; +} + /* * The actual machine check handler. This only handles real * exceptions when something got corrupted coming in through int 18. @@ -998,8 +1013,6 @@ void do_machine_check(struct pt_regs *regs, long error_code) DECLARE_BITMAP(toclear, MAX_NR_BANKS); DECLARE_BITMAP(valid_banks, MAX_NR_BANKS); char *msg = "Unknown"; - u64 recover_paddr = ~0ull; - int flags = MF_ACTION_REQUIRED; int lmce = 0; /* If this CPU is offline, just bail out. */ @@ -1136,22 +1149,13 @@ void do_machine_check(struct pt_regs *regs, long error_code) } /* - * At insane "tolerant" levels we take no action. Otherwise - * we only die if we have no other choice. For less serious - * issues we try to recover, or limit damage to the current - * process. + * If tolerant is at an insane level we drop requests to kill + * processes and continue even when there is no way out. */ - if (cfg->tolerant < 3) { - if (no_way_out) - mce_panic("Fatal machine check on current CPU", &m, msg); - if (worst == MCE_AR_SEVERITY) { - recover_paddr = m.addr; - if (!(m.mcgstatus & MCG_STATUS_RIPV)) - flags |= MF_MUST_KILL; - } else if (kill_it) { - force_sig(SIGBUS, current); - } - } + if (cfg->tolerant == 3) + kill_it = 0; + else if (no_way_out) + mce_panic("Fatal machine check on current CPU", &m, msg); if (worst > 0) mce_report_event(regs); @@ -1159,25 +1163,24 @@ void do_machine_check(struct pt_regs *regs, long error_code) out: sync_core(); - if (recover_paddr == ~0ull) - goto done; + if (worst != MCE_AR_SEVERITY && !kill_it) + goto out_ist; - pr_err("Uncorrected hardware memory error in user-access at %llx", - recover_paddr); - /* - * We must call memory_failure() here even if the current process is - * doomed. We still need to mark the page as poisoned and alert any - * other users of the page. - */ - ist_begin_non_atomic(regs); - local_irq_enable(); - if (memory_failure(recover_paddr >> PAGE_SHIFT, MCE_VECTOR, flags) < 0) { - pr_err("Memory error not recovered"); - force_sig(SIGBUS, current); + /* Fault was in user mode and we need to take some action */ + if ((m.cs & 3) == 3) { + ist_begin_non_atomic(regs); + local_irq_enable(); + + if (kill_it || do_memory_failure(&m)) + force_sig(SIGBUS, current); + local_irq_disable(); + ist_end_non_atomic(); + } else { + if (!fixup_mcexception(regs)) + mce_panic("Failed kernel mode recovery", &m, NULL); } - local_irq_disable(); - ist_end_non_atomic(); -done: + +out_ist: ist_exit(regs); } EXPORT_SYMBOL_GPL(do_machine_check); diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c index a0695be19864..3d42d0ef3333 100644 --- a/arch/x86/kernel/x8664_ksyms_64.c +++ b/arch/x86/kernel/x8664_ksyms_64.c @@ -37,6 +37,10 @@ EXPORT_SYMBOL(__copy_user_nocache); EXPORT_SYMBOL(_copy_from_user); EXPORT_SYMBOL(_copy_to_user); +#ifdef CONFIG_MCE_KERNEL_RECOVERY +EXPORT_SYMBOL(__mcsafe_copy); +#endif + EXPORT_SYMBOL(copy_page); EXPORT_SYMBOL(clear_page); diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S index 16698bba87de..2fad83c314cc 100644 --- a/arch/x86/lib/memcpy_64.S +++ b/arch/x86/lib/memcpy_64.S @@ -177,3 +177,140 @@ ENTRY(memcpy_orig) .Lend: retq ENDPROC(memcpy_orig) + +#ifdef CONFIG_MCE_KERNEL_RECOVERY +/* + * __mcsafe_copy - memory copy with machine check exception handling + * Note that we only catch machine checks when reading the source addresses. + * Writes to target are posted and don't generate machine checks. + */ +ENTRY(__mcsafe_copy) + cmpl $8,%edx + jb 20f /* less then 8 bytes, go to byte copy loop */ + + /* check for bad alignment of source */ + movl %esi,%ecx + andl $7,%ecx + jz 102f /* already aligned */ + subl $8,%ecx + negl %ecx + subl %ecx,%edx +0: movb (%rsi),%al + movb %al,(%rdi) + incq %rsi + incq %rdi + decl %ecx + jnz 0b +102: + movl %edx,%ecx + andl $63,%edx + shrl $6,%ecx + jz 17f +1: movq (%rsi),%r8 +2: movq 1*8(%rsi),%r9 +3: movq 2*8(%rsi),%r10 +4: movq 3*8(%rsi),%r11 + mov %r8,(%rdi) + mov %r9,1*8(%rdi) + mov %r10,2*8(%rdi) + mov %r11,3*8(%rdi) +9: movq 4*8(%rsi),%r8 +10: movq 5*8(%rsi),%r9 +11: movq 6*8(%rsi),%r10 +12: movq 7*8(%rsi),%r11 + mov %r8,4*8(%rdi) + mov %r9,5*8(%rdi) + mov %r10,6*8(%rdi) + mov %r11,7*8(%rdi) + leaq 64(%rsi),%rsi + leaq 64(%rdi),%rdi + decl %ecx + jnz 1b +17: movl %edx,%ecx + andl $7,%edx + shrl $3,%ecx + jz 20f +18: movq (%rsi),%r8 + mov %r8,(%rdi) + leaq 8(%rsi),%rsi + leaq 8(%rdi),%rdi + decl %ecx + jnz 18b +20: andl %edx,%edx + jz 23f + movl %edx,%ecx +21: movb (%rsi),%al + movb %al,(%rdi) + incq %rsi + incq %rdi + decl %ecx + jnz 21b +23: xorq %rax, %rax + xorq %rdx, %rdx + sfence + /* copy successful. return 0 */ + ret + + .section .fixup,"ax" + /* fixups for machine check */ +30: + add %ecx,%edx + jmp 100f +31: + shl $6,%ecx + add %ecx,%edx + jmp 100f +32: + shl $6,%ecx + lea -8(%ecx,%edx),%edx + jmp 100f +33: + shl $6,%ecx + lea -16(%ecx,%edx),%edx + jmp 100f +34: + shl $6,%ecx + lea -24(%ecx,%edx),%edx + jmp 100f +35: + shl $6,%ecx + lea -32(%ecx,%edx),%edx + jmp 100f +36: + shl $6,%ecx + lea -40(%ecx,%edx),%edx + jmp 100f +37: + shl $6,%ecx + lea -48(%ecx,%edx),%edx + jmp 100f +38: + shl $6,%ecx + lea -56(%ecx,%edx),%edx + jmp 100f +39: + lea (%rdx,%rcx,8),%rdx + jmp 100f +40: + mov %ecx,%edx +100: + sfence + + /* %rax prepared in fixup_exception()/fixup_mcexception() */ + ret +GLOBAL(__mcsafe_copy_end) + .previous + + _ASM_EXTABLE(0b,30b) + _ASM_EXTABLE(1b,31b) + _ASM_EXTABLE(2b,32b) + _ASM_EXTABLE(3b,33b) + _ASM_EXTABLE(4b,34b) + _ASM_EXTABLE(9b,35b) + _ASM_EXTABLE(10b,36b) + _ASM_EXTABLE(11b,37b) + _ASM_EXTABLE(12b,38b) + _ASM_EXTABLE(18b,39b) + _ASM_EXTABLE(21b,40b) +ENDPROC(__mcsafe_copy) +#endif diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c index 903ec1e9c326..d0f5600df5e5 100644 --- a/arch/x86/mm/extable.c +++ b/arch/x86/mm/extable.c @@ -2,6 +2,7 @@ #include <linux/spinlock.h> #include <linux/sort.h> #include <asm/uaccess.h> +#include <asm/mce.h> static inline unsigned long ex_insn_addr(const struct exception_table_entry *x) @@ -37,11 +38,18 @@ int fixup_exception(struct pt_regs *regs) if (fixup) { new_ip = ex_fixup_addr(fixup); +#ifdef CONFIG_MCE_KERNEL_RECOVERY + if (regs->ip >= (unsigned long)__mcsafe_copy && + regs->ip <= (unsigned long)__mcsafe_copy_end) + regs->ax = -EFAULT; +#endif + if (fixup->fixup - fixup->insn >= 0x7ffffff0 - 4) { /* Special hack for uaccess_err */ current_thread_info()->uaccess_err = 1; new_ip -= 0x7ffffff0; } + regs->ip = new_ip; return 1; } @@ -49,6 +57,29 @@ int fixup_exception(struct pt_regs *regs) return 0; } +#ifdef CONFIG_MCE_KERNEL_RECOVERY +int fixup_mcexception(struct pt_regs *regs) +{ + const struct exception_table_entry *fixup; + unsigned long new_ip; + + fixup = search_exception_tables(regs->ip); + if (fixup) { + new_ip = ex_fixup_addr(fixup); + + if (regs->ip >= (unsigned long)__mcsafe_copy && + regs->ip <= (unsigned long)__mcsafe_copy_end) + regs->ax = -EINVAL; + + regs->ip = new_ip; + + return 1; + } + + return 0; +} +#endif + /* Restricted version used during very early boot */ int __init early_fixup_exception(unsigned long *ip) { -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-25 11:49 ` Borislav Petkov @ 2015-12-25 20:05 ` Luck, Tony 2015-12-26 10:32 ` Borislav Petkov 0 siblings, 1 reply; 27+ messages in thread From: Luck, Tony @ 2015-12-25 20:05 UTC (permalink / raw) To: Borislav Petkov Cc: Ingo Molnar, Andrew Morton, Andy Lutomirski, Williams, Dan J, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 mce_in_kernel_recov() should check whether we have a fix up entry for the specific IP that hit the machine check before rating the severity as kernel recoverable. If we add more functions (for different cache behaviour, or to optimize for specific processor model) we can make sure to put them all together inside begin/end labels. We would run into trouble if we want to have some in-line macros for use from arbitrary C-code like we have for the page fault case. I might make the arbitrary %rax value be #PF and #MC to reflect the h/w fault that got us here rather than -EINVAL/-EFAULT. But that's just bike shedding. But now we are back to having the fault handler poke %rax again, which made Andy twitch before. Sent from my iPhone > On Dec 25, 2015, at 03:49, Borislav Petkov <bp@alien8.de> wrote: > >> On Tue, Dec 15, 2015 at 05:30:49PM -0800, Tony Luck wrote: >> Using __copy_user_nocache() as inspiration create a memory copy >> routine for use by kernel code with annotations to allow for >> recovery from machine checks. >> >> Notes: >> 1) We align the source address rather than the destination. This >> means we never have to deal with a memory read that spans two >> cache lines ... so we can provide a precise indication of >> where the error occurred without having to re-execute at >> a byte-by-byte level to find the exact spot like the original >> did. >> 2) We 'or' BIT(63) into the return if the copy failed because of >> a machine check. If we failed during a copy from user space >> because the user provided a bad address, then we just return >> then number of bytes not copied like other copy_from_user >> functions. >> 3) This code doesn't play any cache games. Future functions can >> use non-temporal loads/stores to meet needs of different callers. >> 4) Provide helpful macros to decode the return value. >> >> Signed-off-by: Tony Luck <tony.luck@intel.com> >> --- >> Boris: This version has all the return options coded. >> return 0; /* SUCCESS */ >> return remain_bytes | (1ul << 63); /* failed because of machine check */ >> return remain_bytes; /* failed because of invalid source address */ > > Ok, how about a much simpler approach and finally getting rid of that > bit 63? :-) > > Here's what we could do, it is totally untested but at least it builds > here (full patch below). > > So first we define __mcsafe_copy to return two u64 values, or two > int values or whatever... Bottomline is, we return 2 values with > remain_bytes in %rdx and the actual error in %rax. > > +struct mcsafe_ret { > + u64 ret; > + u64 remain; > +}; > + > +struct mcsafe_ret __mcsafe_copy(void *dst, const void __user *src, unsigned size); > > Then, in fixup_exception()/fixup_mcexception(), we set the *respective* > regs->ax (which is mcsafe_ret.ret) depending on which function is fixing > up the exception. I've made it return -EINVAL and -EFAULT respectively > but those are arbitrary. > > We detect that we're in __mcsafe_copy() by using its start and a > previously defined end label. I've done this in order to get rid of the > mce-specific exception tables. Mind you, this is still precise enough > since we're using the _ASM_EXTABLE entries from __mcsafe_copy. > > And this approach gets rid of those mce-specific exception tables, bit > 63, makes __mcsafe_copy simpler, you name it... :-) > > Thoughts? > > --- > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index adb28a2dab44..efef4d72674c 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -1021,6 +1021,16 @@ config X86_MCE_INJECT > If you don't know what a machine check is and you don't do kernel > QA it is safe to say n. > > +config MCE_KERNEL_RECOVERY > + bool "Recovery from machine checks in special kernel memory copy functions" > + default n > + depends on X86_MCE && X86_64 > + ---help--- > + This option provides a new memory copy function mcsafe_memcpy() > + that is annotated to allow the machine check handler to return > + to an alternate code path to return an error to the caller instead > + of crashing the system. Say yes if you have a driver that uses this. > + > config X86_THERMAL_VECTOR > def_bool y > depends on X86_MCE_INTEL > diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h > index 2ea4527e462f..9c5371d1069b 100644 > --- a/arch/x86/include/asm/mce.h > +++ b/arch/x86/include/asm/mce.h > @@ -287,4 +287,13 @@ struct cper_sec_mem_err; > extern void apei_mce_report_mem_error(int corrected, > struct cper_sec_mem_err *mem_err); > > +#ifdef CONFIG_MCE_KERNEL_RECOVERY > +int fixup_mcexception(struct pt_regs *regs); > +#else > +static inline int fixup_mcexception(struct pt_regs *regs) > +{ > + return 0; > +} > +#endif > + > #endif /* _ASM_X86_MCE_H */ > diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h > index ff8b9a17dc4b..6b6431797749 100644 > --- a/arch/x86/include/asm/string_64.h > +++ b/arch/x86/include/asm/string_64.h > @@ -78,6 +78,16 @@ int strcmp(const char *cs, const char *ct); > #define memset(s, c, n) __memset(s, c, n) > #endif > > +#ifdef CONFIG_MCE_KERNEL_RECOVERY > +struct mcsafe_ret { > + u64 ret; > + u64 remain; > +}; > + > +struct mcsafe_ret __mcsafe_copy(void *dst, const void __user *src, unsigned size); > +extern void __mcsafe_copy_end(void); > +#endif > + > #endif /* __KERNEL__ */ > > #endif /* _ASM_X86_STRING_64_H */ > diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h > index 547720efd923..e8a2c8067fcb 100644 > --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h > +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h > @@ -80,4 +80,12 @@ static inline int apei_clear_mce(u64 record_id) > } > #endif > > +#ifdef CONFIG_MCE_KERNEL_RECOVERY > +static inline bool mce_in_kernel_recov(unsigned long addr) > +{ > + return (addr >= (unsigned long)__mcsafe_copy && > + addr <= (unsigned long)__mcsafe_copy_end); > +} > +#endif > + > void mce_inject_log(struct mce *m); > diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c > index 9c682c222071..a51f0d28cc06 100644 > --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c > +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c > @@ -29,7 +29,7 @@ > * panic situations) > */ > > -enum context { IN_KERNEL = 1, IN_USER = 2 }; > +enum context { IN_KERNEL = 1, IN_USER = 2, IN_KERNEL_RECOV = 3 }; > enum ser { SER_REQUIRED = 1, NO_SER = 2 }; > enum exception { EXCP_CONTEXT = 1, NO_EXCP = 2 }; > > @@ -48,6 +48,7 @@ static struct severity { > #define MCESEV(s, m, c...) { .sev = MCE_ ## s ## _SEVERITY, .msg = m, ## c } > #define KERNEL .context = IN_KERNEL > #define USER .context = IN_USER > +#define KERNEL_RECOV .context = IN_KERNEL_RECOV > #define SER .ser = SER_REQUIRED > #define NOSER .ser = NO_SER > #define EXCP .excp = EXCP_CONTEXT > @@ -87,6 +88,10 @@ static struct severity { > EXCP, KERNEL, MCGMASK(MCG_STATUS_RIPV, 0) > ), > MCESEV( > + PANIC, "In kernel and no restart IP", > + EXCP, KERNEL_RECOV, MCGMASK(MCG_STATUS_RIPV, 0) > + ), > + MCESEV( > DEFERRED, "Deferred error", > NOSER, MASK(MCI_STATUS_UC|MCI_STATUS_DEFERRED|MCI_STATUS_POISON, MCI_STATUS_DEFERRED) > ), > @@ -123,6 +128,11 @@ static struct severity { > MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV) > ), > MCESEV( > + AR, "Action required: data load in error recoverable area of kernel", > + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA), > + KERNEL_RECOV > + ), > + MCESEV( > AR, "Action required: data load error in a user process", > SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA), > USER > @@ -170,6 +180,9 @@ static struct severity { > ) /* always matches. keep at end */ > }; > > +#define mc_recoverable(mcg) (((mcg) & (MCG_STATUS_RIPV|MCG_STATUS_EIPV)) == \ > + (MCG_STATUS_RIPV|MCG_STATUS_EIPV)) > + > /* > * If mcgstatus indicated that ip/cs on the stack were > * no good, then "m->cs" will be zero and we will have > @@ -183,7 +196,11 @@ static struct severity { > */ > static int error_context(struct mce *m) > { > - return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL; > + if ((m->cs & 3) == 3) > + return IN_USER; > + if (mc_recoverable(m->mcgstatus) && mce_in_kernel_recov(m->ip)) > + return IN_KERNEL_RECOV; > + return IN_KERNEL; > } > > /* > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c > index a006f4cd792b..3dc2cbc3f62d 100644 > --- a/arch/x86/kernel/cpu/mcheck/mce.c > +++ b/arch/x86/kernel/cpu/mcheck/mce.c > @@ -31,6 +31,7 @@ > #include <linux/types.h> > #include <linux/slab.h> > #include <linux/init.h> > +#include <linux/module.h> > #include <linux/kmod.h> > #include <linux/poll.h> > #include <linux/nmi.h> > @@ -961,6 +962,20 @@ static void mce_clear_state(unsigned long *toclear) > } > } > > +static int do_memory_failure(struct mce *m) > +{ > + int flags = MF_ACTION_REQUIRED; > + int ret; > + > + pr_err("Uncorrected hardware memory error in user-access at %llx", m->addr); > + if (!(m->mcgstatus & MCG_STATUS_RIPV)) > + flags |= MF_MUST_KILL; > + ret = memory_failure(m->addr >> PAGE_SHIFT, MCE_VECTOR, flags); > + if (ret) > + pr_err("Memory error not recovered"); > + return ret; > +} > + > /* > * The actual machine check handler. This only handles real > * exceptions when something got corrupted coming in through int 18. > @@ -998,8 +1013,6 @@ void do_machine_check(struct pt_regs *regs, long error_code) > DECLARE_BITMAP(toclear, MAX_NR_BANKS); > DECLARE_BITMAP(valid_banks, MAX_NR_BANKS); > char *msg = "Unknown"; > - u64 recover_paddr = ~0ull; > - int flags = MF_ACTION_REQUIRED; > int lmce = 0; > > /* If this CPU is offline, just bail out. */ > @@ -1136,22 +1149,13 @@ void do_machine_check(struct pt_regs *regs, long error_code) > } > > /* > - * At insane "tolerant" levels we take no action. Otherwise > - * we only die if we have no other choice. For less serious > - * issues we try to recover, or limit damage to the current > - * process. > + * If tolerant is at an insane level we drop requests to kill > + * processes and continue even when there is no way out. > */ > - if (cfg->tolerant < 3) { > - if (no_way_out) > - mce_panic("Fatal machine check on current CPU", &m, msg); > - if (worst == MCE_AR_SEVERITY) { > - recover_paddr = m.addr; > - if (!(m.mcgstatus & MCG_STATUS_RIPV)) > - flags |= MF_MUST_KILL; > - } else if (kill_it) { > - force_sig(SIGBUS, current); > - } > - } > + if (cfg->tolerant == 3) > + kill_it = 0; > + else if (no_way_out) > + mce_panic("Fatal machine check on current CPU", &m, msg); > > if (worst > 0) > mce_report_event(regs); > @@ -1159,25 +1163,24 @@ void do_machine_check(struct pt_regs *regs, long error_code) > out: > sync_core(); > > - if (recover_paddr == ~0ull) > - goto done; > + if (worst != MCE_AR_SEVERITY && !kill_it) > + goto out_ist; > > - pr_err("Uncorrected hardware memory error in user-access at %llx", > - recover_paddr); > - /* > - * We must call memory_failure() here even if the current process is > - * doomed. We still need to mark the page as poisoned and alert any > - * other users of the page. > - */ > - ist_begin_non_atomic(regs); > - local_irq_enable(); > - if (memory_failure(recover_paddr >> PAGE_SHIFT, MCE_VECTOR, flags) < 0) { > - pr_err("Memory error not recovered"); > - force_sig(SIGBUS, current); > + /* Fault was in user mode and we need to take some action */ > + if ((m.cs & 3) == 3) { > + ist_begin_non_atomic(regs); > + local_irq_enable(); > + > + if (kill_it || do_memory_failure(&m)) > + force_sig(SIGBUS, current); > + local_irq_disable(); > + ist_end_non_atomic(); > + } else { > + if (!fixup_mcexception(regs)) > + mce_panic("Failed kernel mode recovery", &m, NULL); > } > - local_irq_disable(); > - ist_end_non_atomic(); > -done: > + > +out_ist: > ist_exit(regs); > } > EXPORT_SYMBOL_GPL(do_machine_check); > diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c > index a0695be19864..3d42d0ef3333 100644 > --- a/arch/x86/kernel/x8664_ksyms_64.c > +++ b/arch/x86/kernel/x8664_ksyms_64.c > @@ -37,6 +37,10 @@ EXPORT_SYMBOL(__copy_user_nocache); > EXPORT_SYMBOL(_copy_from_user); > EXPORT_SYMBOL(_copy_to_user); > > +#ifdef CONFIG_MCE_KERNEL_RECOVERY > +EXPORT_SYMBOL(__mcsafe_copy); > +#endif > + > EXPORT_SYMBOL(copy_page); > EXPORT_SYMBOL(clear_page); > > diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S > index 16698bba87de..2fad83c314cc 100644 > --- a/arch/x86/lib/memcpy_64.S > +++ b/arch/x86/lib/memcpy_64.S > @@ -177,3 +177,140 @@ ENTRY(memcpy_orig) > .Lend: > retq > ENDPROC(memcpy_orig) > + > +#ifdef CONFIG_MCE_KERNEL_RECOVERY > +/* > + * __mcsafe_copy - memory copy with machine check exception handling > + * Note that we only catch machine checks when reading the source addresses. > + * Writes to target are posted and don't generate machine checks. > + */ > +ENTRY(__mcsafe_copy) > + cmpl $8,%edx > + jb 20f /* less then 8 bytes, go to byte copy loop */ > + > + /* check for bad alignment of source */ > + movl %esi,%ecx > + andl $7,%ecx > + jz 102f /* already aligned */ > + subl $8,%ecx > + negl %ecx > + subl %ecx,%edx > +0: movb (%rsi),%al > + movb %al,(%rdi) > + incq %rsi > + incq %rdi > + decl %ecx > + jnz 0b > +102: > + movl %edx,%ecx > + andl $63,%edx > + shrl $6,%ecx > + jz 17f > +1: movq (%rsi),%r8 > +2: movq 1*8(%rsi),%r9 > +3: movq 2*8(%rsi),%r10 > +4: movq 3*8(%rsi),%r11 > + mov %r8,(%rdi) > + mov %r9,1*8(%rdi) > + mov %r10,2*8(%rdi) > + mov %r11,3*8(%rdi) > +9: movq 4*8(%rsi),%r8 > +10: movq 5*8(%rsi),%r9 > +11: movq 6*8(%rsi),%r10 > +12: movq 7*8(%rsi),%r11 > + mov %r8,4*8(%rdi) > + mov %r9,5*8(%rdi) > + mov %r10,6*8(%rdi) > + mov %r11,7*8(%rdi) > + leaq 64(%rsi),%rsi > + leaq 64(%rdi),%rdi > + decl %ecx > + jnz 1b > +17: movl %edx,%ecx > + andl $7,%edx > + shrl $3,%ecx > + jz 20f > +18: movq (%rsi),%r8 > + mov %r8,(%rdi) > + leaq 8(%rsi),%rsi > + leaq 8(%rdi),%rdi > + decl %ecx > + jnz 18b > +20: andl %edx,%edx > + jz 23f > + movl %edx,%ecx > +21: movb (%rsi),%al > + movb %al,(%rdi) > + incq %rsi > + incq %rdi > + decl %ecx > + jnz 21b > +23: xorq %rax, %rax > + xorq %rdx, %rdx > + sfence > + /* copy successful. return 0 */ > + ret > + > + .section .fixup,"ax" > + /* fixups for machine check */ > +30: > + add %ecx,%edx > + jmp 100f > +31: > + shl $6,%ecx > + add %ecx,%edx > + jmp 100f > +32: > + shl $6,%ecx > + lea -8(%ecx,%edx),%edx > + jmp 100f > +33: > + shl $6,%ecx > + lea -16(%ecx,%edx),%edx > + jmp 100f > +34: > + shl $6,%ecx > + lea -24(%ecx,%edx),%edx > + jmp 100f > +35: > + shl $6,%ecx > + lea -32(%ecx,%edx),%edx > + jmp 100f > +36: > + shl $6,%ecx > + lea -40(%ecx,%edx),%edx > + jmp 100f > +37: > + shl $6,%ecx > + lea -48(%ecx,%edx),%edx > + jmp 100f > +38: > + shl $6,%ecx > + lea -56(%ecx,%edx),%edx > + jmp 100f > +39: > + lea (%rdx,%rcx,8),%rdx > + jmp 100f > +40: > + mov %ecx,%edx > +100: > + sfence > + > + /* %rax prepared in fixup_exception()/fixup_mcexception() */ > + ret > +GLOBAL(__mcsafe_copy_end) > + .previous > + > + _ASM_EXTABLE(0b,30b) > + _ASM_EXTABLE(1b,31b) > + _ASM_EXTABLE(2b,32b) > + _ASM_EXTABLE(3b,33b) > + _ASM_EXTABLE(4b,34b) > + _ASM_EXTABLE(9b,35b) > + _ASM_EXTABLE(10b,36b) > + _ASM_EXTABLE(11b,37b) > + _ASM_EXTABLE(12b,38b) > + _ASM_EXTABLE(18b,39b) > + _ASM_EXTABLE(21b,40b) > +ENDPROC(__mcsafe_copy) > +#endif > diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c > index 903ec1e9c326..d0f5600df5e5 100644 > --- a/arch/x86/mm/extable.c > +++ b/arch/x86/mm/extable.c > @@ -2,6 +2,7 @@ > #include <linux/spinlock.h> > #include <linux/sort.h> > #include <asm/uaccess.h> > +#include <asm/mce.h> > > static inline unsigned long > ex_insn_addr(const struct exception_table_entry *x) > @@ -37,11 +38,18 @@ int fixup_exception(struct pt_regs *regs) > if (fixup) { > new_ip = ex_fixup_addr(fixup); > > +#ifdef CONFIG_MCE_KERNEL_RECOVERY > + if (regs->ip >= (unsigned long)__mcsafe_copy && > + regs->ip <= (unsigned long)__mcsafe_copy_end) > + regs->ax = -EFAULT; > +#endif > + > if (fixup->fixup - fixup->insn >= 0x7ffffff0 - 4) { > /* Special hack for uaccess_err */ > current_thread_info()->uaccess_err = 1; > new_ip -= 0x7ffffff0; > } > + > regs->ip = new_ip; > return 1; > } > @@ -49,6 +57,29 @@ int fixup_exception(struct pt_regs *regs) > return 0; > } > > +#ifdef CONFIG_MCE_KERNEL_RECOVERY > +int fixup_mcexception(struct pt_regs *regs) > +{ > + const struct exception_table_entry *fixup; > + unsigned long new_ip; > + > + fixup = search_exception_tables(regs->ip); > + if (fixup) { > + new_ip = ex_fixup_addr(fixup); > + > + if (regs->ip >= (unsigned long)__mcsafe_copy && > + regs->ip <= (unsigned long)__mcsafe_copy_end) > + regs->ax = -EINVAL; > + > + regs->ip = new_ip; > + > + return 1; > + } > + > + return 0; > +} > +#endif > + > /* Restricted version used during very early boot */ > int __init early_fixup_exception(unsigned long *ip) > { > > -- > Regards/Gruss, > Boris. > > ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-25 20:05 ` Luck, Tony @ 2015-12-26 10:32 ` Borislav Petkov 2015-12-26 14:54 ` Andy Lutomirski 0 siblings, 1 reply; 27+ messages in thread From: Borislav Petkov @ 2015-12-26 10:32 UTC (permalink / raw) To: Luck, Tony Cc: Ingo Molnar, Andrew Morton, Andy Lutomirski, Williams, Dan J, elliott, linux-kernel, linux-mm, linux-nvdimm, x86 On Fri, Dec 25, 2015 at 08:05:39PM +0000, Luck, Tony wrote: > mce_in_kernel_recov() should check whether we have a fix up entry for > the specific IP that hit the machine check before rating the severity > as kernel recoverable. Yeah, it is not precise right now. But this is easy - I'll change it to a simpler version of fixup_mcexception() to iterate over the exception table. > If we add more functions (for different cache behaviour, or to > optimize for specific processor model) we can make sure to put them > all together inside begin/end labels. Yeah, I think we can do even better than that as all the info is in the ELF file already. For example, ENDPROC(__mcsafe_copy) generates .type __mcsafe_copy, @function ; .size __mcsafe_copy, .-__mcsafe_copy and there's the size of the function, I guess we can macroize something like that or even parse the ELF file: $ readelf --syms vmlinux | grep mcsafe 706: ffffffff819df73e 14 OBJECT LOCAL DEFAULT 11 __kstrtab___mcsafe_copy 707: ffffffff819d0e18 8 OBJECT LOCAL DEFAULT 9 __kcrctab___mcsafe_copy 56107: ffffffff819b3bb0 16 OBJECT GLOBAL DEFAULT 7 __ksymtab___mcsafe_copy 58581: ffffffff812e6d70 179 FUNC GLOBAL DEFAULT 1 __mcsafe_copy 62233: 000000003313f9d4 0 NOTYPE GLOBAL DEFAULT ABS __crc___mcsafe_copy 68818: ffffffff812e6e23 0 NOTYPE GLOBAL DEFAULT 1 __mcsafe_copy_end __mcsafe_copy is of size 179 bytes: 0xffffffff812e6d70 + 179 = 0xffffffff812e6e23 which is __mcsafe_copy_end so those labels should not really be necessary as they're global and polluting the binary unnecessarily. > We would run into trouble if we want to have some in-line macros for > use from arbitrary C-code like we have for the page fault case. Example? > I might make the arbitrary %rax value be #PF and #MC to reflect the > h/w fault that got us here rather than -EINVAL/-EFAULT. But that's > just bike shedding. Yeah, I picked those arbitrarily to show the intention. > But now we are back to having the fault handler poke %rax again, which > made Andy twitch before. Andy, why is that? It makes the exception handling much simpler this way... -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-26 10:32 ` Borislav Petkov @ 2015-12-26 14:54 ` Andy Lutomirski 2015-12-27 2:08 ` Tony Luck 0 siblings, 1 reply; 27+ messages in thread From: Andy Lutomirski @ 2015-12-26 14:54 UTC (permalink / raw) To: Borislav Petkov Cc: linux-nvdimm, X86 ML, elliott, Luck, Tony, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Dec 26, 2015 6:33 PM, "Borislav Petkov" <bp@alien8.de> wrote: > > On Fri, Dec 25, 2015 at 08:05:39PM +0000, Luck, Tony wrote: > > mce_in_kernel_recov() should check whether we have a fix up entry for > > the specific IP that hit the machine check before rating the severity > > as kernel recoverable. > > Yeah, it is not precise right now. But this is easy - I'll change it to > a simpler version of fixup_mcexception() to iterate over the exception > table. > > > If we add more functions (for different cache behaviour, or to > > optimize for specific processor model) we can make sure to put them > > all together inside begin/end labels. > > Yeah, I think we can do even better than that as all the info is in the > ELF file already. For example, ENDPROC(__mcsafe_copy) generates > > .type __mcsafe_copy, @function ; .size __mcsafe_copy, .-__mcsafe_copy > > and there's the size of the function, I guess we can macroize something > like that or even parse the ELF file: > > $ readelf --syms vmlinux | grep mcsafe > 706: ffffffff819df73e 14 OBJECT LOCAL DEFAULT 11 __kstrtab___mcsafe_copy > 707: ffffffff819d0e18 8 OBJECT LOCAL DEFAULT 9 __kcrctab___mcsafe_copy > 56107: ffffffff819b3bb0 16 OBJECT GLOBAL DEFAULT 7 __ksymtab___mcsafe_copy > 58581: ffffffff812e6d70 179 FUNC GLOBAL DEFAULT 1 __mcsafe_copy > 62233: 000000003313f9d4 0 NOTYPE GLOBAL DEFAULT ABS __crc___mcsafe_copy > 68818: ffffffff812e6e23 0 NOTYPE GLOBAL DEFAULT 1 __mcsafe_copy_end > > __mcsafe_copy is of size 179 bytes: > > 0xffffffff812e6d70 + 179 = 0xffffffff812e6e23 which is __mcsafe_copy_end > so those labels should not really be necessary as they're global and > polluting the binary unnecessarily. > > > We would run into trouble if we want to have some in-line macros for > > use from arbitrary C-code like we have for the page fault case. > > Example? > > > I might make the arbitrary %rax value be #PF and #MC to reflect the > > h/w fault that got us here rather than -EINVAL/-EFAULT. But that's > > just bike shedding. > > Yeah, I picked those arbitrarily to show the intention. > > > But now we are back to having the fault handler poke %rax again, which > > made Andy twitch before. > > Andy, why is that? It makes the exception handling much simpler this way... > I like the idea of moving more logic into C, but I don't like splitting the logic across files and adding nasty special cases like this. But what if we generalized it? An extable entry gives a fault IP and a landing pad IP. Surely we can squeeze a flag bit in there. If you set the bit, you get an extended extable entry. Instead of storing a landing pad, it stores a pointer to a handler descriptor: struct extable_handler { bool (*handler)(struct pt_regs *, struct extable_handler *, ...): }; handler returns true if it handled the error and false if it didn't. The "..." encodes the fault number, error code, cr2, etc. Maybe it would be "unsigned long exception, const struct extable_info *info" where extable_info contains a union? I really wish C would grow up and learn about union types. Now the copy routine can do whatever it pleases, in C, locally. For example, if you set up a full stack frame (or even just a known SP offset), you could unwind it in C and just return a value directly, or, even better, you could manually tail-call a C fixup that goes one byte at a time instead of writing that mess in asm. Like this, assuming I got it right: regs->sp = regs->bp; regs->bp = *(unsigned long *)regs->sp; regs->sp += sizeof(unsigned long); regs->ip = fix_it; regs->di = something useful? Bonus points if you can figure out a clean way to register a handler for an IP range without bloating struct extable_entry. --Andy P.S. this mechanism could potentially clean up some entry nastiness, too. P.P.S. Why the hell doesn't *user* code have a mechanism like this? Windows does, and it's been around for longer than I've known how to write C code... ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-26 14:54 ` Andy Lutomirski @ 2015-12-27 2:08 ` Tony Luck 2015-12-27 2:15 ` Andy Lutomirski 0 siblings, 1 reply; 27+ messages in thread From: Tony Luck @ 2015-12-27 2:08 UTC (permalink / raw) To: Andy Lutomirski Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sat, Dec 26, 2015 at 6:54 AM, Andy Lutomirski <luto@amacapital.net> wrote: > On Dec 26, 2015 6:33 PM, "Borislav Petkov" <bp@alien8.de> wrote: >> Andy, why is that? It makes the exception handling much simpler this way... >> > > I like the idea of moving more logic into C, but I don't like > splitting the logic across files and adding nasty special cases like > this. > > But what if we generalized it? An extable entry gives a fault IP and > a landing pad IP. Surely we can squeeze a flag bit in there. The clever squeezers have already been here. Instead of a pair of 64-bit values for fault_ip and fixup_ip they managed with a pair of 32-bit values that are each the relative offset of the desired address from the table location itself. We could make one of them 31-bits (since even an "allyesconfig" kernel is still much smaller than a gigabyte) to free a bit for a flag. But there are those external tools to pre-sort exception tables that would all need to be fixed too. Or we could direct the new fixups into a .fixup2 ELF section and put begin/end labels around that ... so we could check the address of the fixup to see whether it is a legacy or new format entry. > set the bit, you get an extended extable entry. Instead of storing a > landing pad, it stores a pointer to a handler descriptor: > > struct extable_handler { > bool (*handler)(struct pt_regs *, struct extable_handler *, ...): > }; > > handler returns true if it handled the error and false if it didn't. It may be had to call that from the machine check handler ... the beauty of just patching the IP and returning from the handler was that it got us out of machine check context. > The "..." encodes the fault number, error code, cr2, etc. Maybe it > would be "unsigned long exception, const struct extable_info *info" > where extable_info contains a union? I really wish C would grow up > and learn about union types. All this is made more difficult because the h/w doesn't give us all the things we might want to know (e.g. the virtual address). We just have a physical address (which may be missing some low order bits). -Tony ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 2:08 ` Tony Luck @ 2015-12-27 2:15 ` Andy Lutomirski 2015-12-27 2:16 ` Andy Lutomirski 0 siblings, 1 reply; 27+ messages in thread From: Andy Lutomirski @ 2015-12-27 2:15 UTC (permalink / raw) To: Tony Luck Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sat, Dec 26, 2015 at 6:08 PM, Tony Luck <tony.luck@gmail.com> wrote: > On Sat, Dec 26, 2015 at 6:54 AM, Andy Lutomirski <luto@amacapital.net> wrote: >> On Dec 26, 2015 6:33 PM, "Borislav Petkov" <bp@alien8.de> wrote: >>> Andy, why is that? It makes the exception handling much simpler this way... >>> >> >> I like the idea of moving more logic into C, but I don't like >> splitting the logic across files and adding nasty special cases like >> this. >> >> But what if we generalized it? An extable entry gives a fault IP and >> a landing pad IP. Surely we can squeeze a flag bit in there. > > The clever squeezers have already been here. Instead of a pair > of 64-bit values for fault_ip and fixup_ip they managed with a pair > of 32-bit values that are each the relative offset of the desired address > from the table location itself. > > We could make one of them 31-bits (since even an "allyesconfig" kernel > is still much smaller than a gigabyte) to free a bit for a flag. But there > are those external tools to pre-sort exception tables that would all > need to be fixed too. > > Or we could direct the new fixups into a .fixup2 ELF section and put > begin/end labels around that ... so we could check the address of the > fixup to see whether it is a legacy or new format entry. > Either of those sounds good to me. >> set the bit, you get an extended extable entry. Instead of storing a >> landing pad, it stores a pointer to a handler descriptor: >> >> struct extable_handler { >> bool (*handler)(struct pt_regs *, struct extable_handler *, ...): >> }; >> >> handler returns true if it handled the error and false if it didn't. > > It may be had to call that from the machine check handler ... the > beauty of just patching the IP and returning from the handler was > that it got us out of machine check context. Your handler will need to know that it's in machine check context :) In most cases (e.g. yours), the handler should just modify regs and return. > >> The "..." encodes the fault number, error code, cr2, etc. Maybe it >> would be "unsigned long exception, const struct extable_info *info" >> where extable_info contains a union? I really wish C would grow up >> and learn about union types. > > All this is made more difficult because the h/w doesn't give us > all the things we might want to know (e.g. the virtual address). > We just have a physical address (which may be missing some > low order bits). True. I'm afraid that nothing I suggest can possibly help you there. Anyhow, this could be a decent signature: bool (*handler)(struct pt_regs *, struct extable_handler *, unsigned int exception, unsigned long error_code, unsigned long extra): If exception is X86_TRAP_PF, then extra is CR2. If exception is X86_TRAP_MC, then extra is however much of the PA you know. --Andy ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 2:15 ` Andy Lutomirski @ 2015-12-27 2:16 ` Andy Lutomirski 2015-12-27 6:57 ` Tony Luck 0 siblings, 1 reply; 27+ messages in thread From: Andy Lutomirski @ 2015-12-27 2:16 UTC (permalink / raw) To: Tony Luck Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sat, Dec 26, 2015 at 6:15 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Sat, Dec 26, 2015 at 6:08 PM, Tony Luck <tony.luck@gmail.com> wrote: >> On Sat, Dec 26, 2015 at 6:54 AM, Andy Lutomirski <luto@amacapital.net> wrote: >>> On Dec 26, 2015 6:33 PM, "Borislav Petkov" <bp@alien8.de> wrote: >>>> Andy, why is that? It makes the exception handling much simpler this way... >>>> >>> >>> I like the idea of moving more logic into C, but I don't like >>> splitting the logic across files and adding nasty special cases like >>> this. >>> >>> But what if we generalized it? An extable entry gives a fault IP and >>> a landing pad IP. Surely we can squeeze a flag bit in there. >> >> The clever squeezers have already been here. Instead of a pair >> of 64-bit values for fault_ip and fixup_ip they managed with a pair >> of 32-bit values that are each the relative offset of the desired address >> from the table location itself. >> >> We could make one of them 31-bits (since even an "allyesconfig" kernel >> is still much smaller than a gigabyte) to free a bit for a flag. But there >> are those external tools to pre-sort exception tables that would all >> need to be fixed too. Wait, why? The external tools sort by source address, and we'd squeeze the flag into the target address, no? --Andy ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 2:16 ` Andy Lutomirski @ 2015-12-27 6:57 ` Tony Luck 2015-12-27 10:09 ` Borislav Petkov 2015-12-27 12:18 ` Andy Lutomirski 0 siblings, 2 replies; 27+ messages in thread From: Tony Luck @ 2015-12-27 6:57 UTC (permalink / raw) To: Andy Lutomirski Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sat, Dec 26, 2015 at 6:16 PM, Andy Lutomirski <luto@amacapital.net> wrote: >>> We could make one of them 31-bits (since even an "allyesconfig" kernel >>> is still much smaller than a gigabyte) to free a bit for a flag. But there >>> are those external tools to pre-sort exception tables that would all >>> need to be fixed too. > > Wait, why? The external tools sort by source address, and we'd > squeeze the flag into the target address, no? I was thinking that we'd need to recompute the fixup when we move the entry to its new sorted location. So that: ex_fixup_addr(const struct exception_table_entry *x) { return (unsigned long)&x->fixup + x->fixup; } will get the right value. Maybe this would still work out if the fixup is a 31-bit value plus a flag, but the external tool thinks it is a 32-bit value? I'd have to ponder that. -Tony ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 6:57 ` Tony Luck @ 2015-12-27 10:09 ` Borislav Petkov 2015-12-27 12:19 ` Andy Lutomirski 2015-12-27 12:18 ` Andy Lutomirski 1 sibling, 1 reply; 27+ messages in thread From: Borislav Petkov @ 2015-12-27 10:09 UTC (permalink / raw) To: Tony Luck, Andy Lutomirski Cc: linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sat, Dec 26, 2015 at 10:57:26PM -0800, Tony Luck wrote: > ... will get the right value. Maybe this would still work out > if the fixup is a 31-bit value plus a flag, but the external > tool thinks it is a 32-bit value? I'd have to ponder that. I still fail to see why do we need to make it so complicated and can't do something like: fixup_exception: ... #ifdef CONFIG_MCE_KERNEL_RECOVERY if (regs->ip >= (unsigned long)__mcsafe_copy && regs->ip <= (unsigned long)__mcsafe_copy_end) run_special_handler(); #endif and that special handler does all the stuff we want. And we pass X86_TRAP* etc through fixup_exception along with whatever else we need from the trap handler... Hmmm? -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 10:09 ` Borislav Petkov @ 2015-12-27 12:19 ` Andy Lutomirski 2015-12-27 13:17 ` Boris Petkov 0 siblings, 1 reply; 27+ messages in thread From: Andy Lutomirski @ 2015-12-27 12:19 UTC (permalink / raw) To: Borislav Petkov Cc: Tony Luck, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sun, Dec 27, 2015 at 2:09 AM, Borislav Petkov <bp@alien8.de> wrote: > On Sat, Dec 26, 2015 at 10:57:26PM -0800, Tony Luck wrote: >> ... will get the right value. Maybe this would still work out >> if the fixup is a 31-bit value plus a flag, but the external >> tool thinks it is a 32-bit value? I'd have to ponder that. > > I still fail to see why do we need to make it so complicated and can't > do something like: > > > fixup_exception: > ... > > #ifdef CONFIG_MCE_KERNEL_RECOVERY > if (regs->ip >= (unsigned long)__mcsafe_copy && > regs->ip <= (unsigned long)__mcsafe_copy_end) > run_special_handler(); > #endif > > and that special handler does all the stuff we want. And we pass > X86_TRAP* etc through fixup_exception along with whatever else we > need from the trap handler... > > Hmmm? You certainly can, but it doesn't scale well to multiple users of similar mechanisms. It also prevents you from using the same mechanism in anything that could be inlined, which is IMO kind of unfortunate. --Andy ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 12:19 ` Andy Lutomirski @ 2015-12-27 13:17 ` Boris Petkov 2015-12-27 13:25 ` Andy Lutomirski 0 siblings, 1 reply; 27+ messages in thread From: Boris Petkov @ 2015-12-27 13:17 UTC (permalink / raw) To: Andy Lutomirski Cc: Tony Luck, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel Andy Lutomirski <luto@amacapital.net> wrote: >You certainly can, but it doesn't scale well to multiple users of >similar mechanisms. It also prevents you from using the same >mechanism in anything that could be inlined, which is IMO kind of >unfortunate. Well, but the bit 31 game doesn't make it any better than the bit 63 fun IMO. Should the exception table entry maybe grow a u32 flags instead? -- Sent from a small device: formatting sux and brevity is inevitable. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 13:17 ` Boris Petkov @ 2015-12-27 13:25 ` Andy Lutomirski 2015-12-27 13:33 ` Borislav Petkov 0 siblings, 1 reply; 27+ messages in thread From: Andy Lutomirski @ 2015-12-27 13:25 UTC (permalink / raw) To: Boris Petkov Cc: Tony Luck, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sun, Dec 27, 2015 at 5:17 AM, Boris Petkov <bp@alien8.de> wrote: > Andy Lutomirski <luto@amacapital.net> wrote: >>You certainly can, but it doesn't scale well to multiple users of >>similar mechanisms. It also prevents you from using the same >>mechanism in anything that could be inlined, which is IMO kind of >>unfortunate. > > Well, but the bit 31 game doesn't make it any better than the bit 63 fun IMO. Should the exception table entry maybe grow a u32 flags instead? > That could significantly bloat the kernel image. Anyway, the bit 31 game isn't so bad IMO because it's localized to the extable macros and the extable reader, whereas the bit 63 thing is all tangled up with the __mcsafe_copy thing, and that's just the first user of a more general mechanism. Did you see this: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=strict_uaccess_fixups/patch_v1&id=16644d9460fc6531456cf510d5efc57f89e5cd34 (If you and/or Tony use it, take out the uaccess stuff -- it's not useful for what you're doing, and I should have stuck that in a separate patch in the first place.) --Andy ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 13:25 ` Andy Lutomirski @ 2015-12-27 13:33 ` Borislav Petkov 2015-12-27 13:40 ` Andy Lutomirski 2015-12-27 19:04 ` Dan Williams 0 siblings, 2 replies; 27+ messages in thread From: Borislav Petkov @ 2015-12-27 13:33 UTC (permalink / raw) To: Andy Lutomirski Cc: Tony Luck, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sun, Dec 27, 2015 at 05:25:45AM -0800, Andy Lutomirski wrote: > That could significantly bloat the kernel image. Yeah, we probably should build an allyesconfig and see how big __ex_table is and compute how much actually that bloat would be, because... > Anyway, the bit 31 game isn't so bad IMO because it's localized to the > extable macros and the extable reader, whereas the bit 63 thing is all > tangled up with the __mcsafe_copy thing, and that's just the first > user of a more general mechanism. > > Did you see this: > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=strict_uaccess_fixups/patch_v1&id=16644d9460fc6531456cf510d5efc57f89e5cd34 ... the problem this has is that you have 4 classes, AFAICT. And since we're talking about a generic mechanism, the moment the 4 classes are not enough, this new scheme fails. I'm just saying... 4 classes are probably more than enough but we don't know. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 13:33 ` Borislav Petkov @ 2015-12-27 13:40 ` Andy Lutomirski 2015-12-27 19:04 ` Dan Williams 1 sibling, 0 replies; 27+ messages in thread From: Andy Lutomirski @ 2015-12-27 13:40 UTC (permalink / raw) To: Borislav Petkov Cc: Tony Luck, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sun, Dec 27, 2015 at 5:33 AM, Borislav Petkov <bp@alien8.de> wrote: > On Sun, Dec 27, 2015 at 05:25:45AM -0800, Andy Lutomirski wrote: >> That could significantly bloat the kernel image. > > Yeah, we probably should build an allyesconfig and see how big > __ex_table is and compute how much actually that bloat would be, > because... > >> Anyway, the bit 31 game isn't so bad IMO because it's localized to the >> extable macros and the extable reader, whereas the bit 63 thing is all >> tangled up with the __mcsafe_copy thing, and that's just the first >> user of a more general mechanism. >> >> Did you see this: >> >> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=strict_uaccess_fixups/patch_v1&id=16644d9460fc6531456cf510d5efc57f89e5cd34 > > ... the problem this has is that you have 4 classes, AFAICT. And since > we're talking about a generic mechanism, the moment the 4 classes are > not enough, this new scheme fails. > > I'm just saying... > > 4 classes are probably more than enough but we don't know. On whatever kernel I just built (defconfig-ish), __ex_table is about 12kB, so the bloat would be 6 kB. In any case, I can think of a total of three useful classes: 0: normal 1: extended (indirect block with fanciness, as I proposed) 2: uaccess (some day, not yet) For rare future things, we could shoehorn them into the extended version's indirect block. Or someone could think of a clever encoding that makes this work better. --Andy ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 13:33 ` Borislav Petkov 2015-12-27 13:40 ` Andy Lutomirski @ 2015-12-27 19:04 ` Dan Williams 1 sibling, 0 replies; 27+ messages in thread From: Dan Williams @ 2015-12-27 19:04 UTC (permalink / raw) To: Borislav Petkov Cc: Andy Lutomirski, Tony Luck, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Ingo Molnar, linux-kernel On Sun, Dec 27, 2015 at 5:33 AM, Borislav Petkov <bp@alien8.de> wrote: > On Sun, Dec 27, 2015 at 05:25:45AM -0800, Andy Lutomirski wrote: >> That could significantly bloat the kernel image. > > Yeah, we probably should build an allyesconfig and see how big > __ex_table is and compute how much actually that bloat would be, > because... > >> Anyway, the bit 31 game isn't so bad IMO because it's localized to the >> extable macros and the extable reader, whereas the bit 63 thing is all >> tangled up with the __mcsafe_copy thing, and that's just the first >> user of a more general mechanism. >> >> Did you see this: >> >> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=strict_uaccess_fixups/patch_v1&id=16644d9460fc6531456cf510d5efc57f89e5cd34 > > ... the problem this has is that you have 4 classes, AFAICT. And since > we're talking about a generic mechanism, the moment the 4 classes are > not enough, this new scheme fails. > > I'm just saying... > > 4 classes are probably more than enough but we don't know. Then we add support for more than 4 when/if the time comes... ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 6:57 ` Tony Luck 2015-12-27 10:09 ` Borislav Petkov @ 2015-12-27 12:18 ` Andy Lutomirski 2015-12-30 23:32 ` Tony Luck 1 sibling, 1 reply; 27+ messages in thread From: Andy Lutomirski @ 2015-12-27 12:18 UTC (permalink / raw) To: Tony Luck Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sat, Dec 26, 2015 at 10:57 PM, Tony Luck <tony.luck@gmail.com> wrote: > On Sat, Dec 26, 2015 at 6:16 PM, Andy Lutomirski <luto@amacapital.net> wrote: >>>> We could make one of them 31-bits (since even an "allyesconfig" kernel >>>> is still much smaller than a gigabyte) to free a bit for a flag. But there >>>> are those external tools to pre-sort exception tables that would all >>>> need to be fixed too. >> >> Wait, why? The external tools sort by source address, and we'd >> squeeze the flag into the target address, no? > > I was thinking that we'd need to recompute the fixup when we move > the entry to its new sorted location. So that: > > ex_fixup_addr(const struct exception_table_entry *x) > { > return (unsigned long)&x->fixup + x->fixup; > } > > will get the right value. Maybe this would still work out > if the fixup is a 31-bit value plus a flag, but the external > tool thinks it is a 32-bit value? I'd have to ponder that. I think I can save you some pondering. This old patch gives two flag bits. Feel free to borrow the patch, but you'll probably want to change the _EXTABLE_CLASS_XYZ macros: https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=strict_uaccess_fixups/patch_v1&id=16644d9460fc6531456cf510d5efc57f89e5cd34 --Andy ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-27 12:18 ` Andy Lutomirski @ 2015-12-30 23:32 ` Tony Luck 2015-12-31 20:30 ` Tony Luck 0 siblings, 1 reply; 27+ messages in thread From: Tony Luck @ 2015-12-30 23:32 UTC (permalink / raw) To: Andy Lutomirski Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Sun, Dec 27, 2015 at 4:18 AM, Andy Lutomirski <luto@amacapital.net> wrote: > I think I can save you some pondering. This old patch gives two flag > bits. Feel free to borrow the patch, but you'll probably want to > change the _EXTABLE_CLASS_XYZ macros: > > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=strict_uaccess_fixups/patch_v1&id=16644d9460fc6531456cf510d5efc57f89e5cd34 Thanks! I took that, and some of Boris's changes, and stirred it altogether at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git mcsafev6 First commit is just your patch from above (patch wouldn't apply it directly because of other nearby changes, but I think I didn't break it) Second commit pulls the core of fixup_exception() into separate functions for each class Third adds a new class that provides the fault number to the fixup code in regs->ax. Fourth is just a jumble of the rest .. needs to be split into two parts (one for machine check handler, second to add __mcsafe_copy()) Fifth is just a hack because I clearly didn't understand what I was doing in parts 2&3 because my new class shows up as '3' not '1'! Andy: Can you explain the assembler/linker arithmetic for the class? -Tony ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-30 23:32 ` Tony Luck @ 2015-12-31 20:30 ` Tony Luck 2015-12-31 21:22 ` Andy Lutomirski 2016-01-01 22:19 ` Tony Luck 0 siblings, 2 replies; 27+ messages in thread From: Tony Luck @ 2015-12-31 20:30 UTC (permalink / raw) To: Andy Lutomirski Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Wed, Dec 30, 2015 at 3:32 PM, Tony Luck <tony.luck@gmail.com> wrote: > Fifth is just a hack because I clearly didn't understand what I was > doing in parts 2&3 because my new class shows up as '3' not '1'! > > Andy: Can you explain the assembler/linker arithmetic for the class? Never mind ... figured it out. The fixup entry in the extable is: label - . + 0x2000000 - BIAS The "label - ." part evaluates to a smallish negative value (because the .fixup section is bundled in towards the end of .text, and the ex_table section comes right after. Then you add 0x20000000 to get a positive number, then *subtract* the BIAS. I'd picked BIAS = 0x40000000 thinking that would show up directly in class bits. But 0x1ffff000 - 0x40000000 is 0xdffff000 so bits 31 & 31 are both set, and this is class3 I switched to BIAS 0xC0000000 ... and now I get class 1 entries (bit31=0, bit30=1). New patch series coming soon. -Tony ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-31 20:30 ` Tony Luck @ 2015-12-31 21:22 ` Andy Lutomirski 2016-01-01 22:19 ` Tony Luck 1 sibling, 0 replies; 27+ messages in thread From: Andy Lutomirski @ 2015-12-31 21:22 UTC (permalink / raw) To: Tony Luck Cc: Robert, Borislav Petkov, linux-mm, X86 ML, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel, linux-nvdimm On Jan 1, 2016 4:30 AM, "Tony Luck" <tony.luck@gmail.com> wrote: > > On Wed, Dec 30, 2015 at 3:32 PM, Tony Luck <tony.luck@gmail.com> wrote: > > Fifth is just a hack because I clearly didn't understand what I was > > doing in parts 2&3 because my new class shows up as '3' not '1'! > > > > Andy: Can you explain the assembler/linker arithmetic for the class? > > Never mind ... figured it out. > > The fixup entry in the extable is: > > label - . + 0x2000000 - BIAS > > The "label - ." part evaluates to a smallish negative value (because > the .fixup section is bundled in towards the end of .text, and the > ex_table section comes right after. > > Then you add 0x20000000 to get a positive number, then *subtract* > the BIAS. I'd picked BIAS = 0x40000000 thinking that would show > up directly in class bits. But 0x1ffff000 - 0x40000000 is 0xdffff000 > so bits 31 & 31 are both set, and this is class3 > > I switched to BIAS 0xC0000000 ... and now I get class 1 entries > (bit31=0, bit30=1). > > New patch series coming soon. That all sounds correct. You could also just to s/UACCESS/INDIRECT/ or whatever and leave using the next bit for whoever does the uaccess part, too. After all, introducing the "uaccess" class without actually implementing it isn't very useful. > > -Tony ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2015-12-31 20:30 ` Tony Luck 2015-12-31 21:22 ` Andy Lutomirski @ 2016-01-01 22:19 ` Tony Luck 2016-01-03 3:40 ` Andy Lutomirski 1 sibling, 1 reply; 27+ messages in thread From: Tony Luck @ 2016-01-01 22:19 UTC (permalink / raw) To: Andy Lutomirski Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel Somehow this didn't get sent ... found it in the "Drafts" folder. But it's rubbish, skip to the bottom. On Thu, Dec 31, 2015 at 12:30 PM, Tony Luck <tony.luck@gmail.com> wrote: > I switched to BIAS 0xC0000000 ... and now I should get class 1 entries > (bit31=0, bit30=1). > > New patch series coming soon. Or not :-( arch/x86/lib/lib.a(memcpy_64.o):(__ex_table+0x4): relocation truncated to fit: R_X86_64_PC32 against `.fixup' arch/x86/lib/lib.a(memcpy_64.o):(__ex_table+0xc): relocation truncated to fit: R_X86_64_PC32 against `.fixup' ... I guess it was something like this that made you do the 0x20000000 and subtract the BIAS? I have a bad feeling that we may not really have four classes, just three: 00: no funny arithmetic 10: BIAS = 0x80000000 ... doesn't trigger truncation warning because sign bit is set 11: BIAS = 0x40000000 ... ditto 01: BIAS = ? ... Is there some magic value for BIAS that gets this? --- end of Draft ... now to the real bit Not sure why I was hung up on *subtracting* values to get the desired class bits. Just blindly copying the initial case from your patch? If you can't get from A to B one way, try going around the other direction. Subtracting 0xC0000000 is the same as adding 0x40000000 (when playing with u32 values). That doesn't upset the linker. I rebased: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git mcsafev6 still needs a little cleanup, but it all works, and seems to be a much cleaner approach. So clean that I wonder whether I really need the CONFIG_MCE_KERNEL_RECOVERY any more?? The only place it is used now is around the __mcsafe_copy() -Tony ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCHV5 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks 2016-01-01 22:19 ` Tony Luck @ 2016-01-03 3:40 ` Andy Lutomirski 0 siblings, 0 replies; 27+ messages in thread From: Andy Lutomirski @ 2016-01-03 3:40 UTC (permalink / raw) To: Tony Luck Cc: Borislav Petkov, linux-nvdimm, X86 ML, elliott, linux-mm, Andrew Morton, Williams, Dan J, Ingo Molnar, linux-kernel On Fri, Jan 1, 2016 at 2:19 PM, Tony Luck <tony.luck@gmail.com> wrote: > Somehow this didn't get sent ... found it in the "Drafts" folder. But > it's rubbish, skip to the > bottom. > > On Thu, Dec 31, 2015 at 12:30 PM, Tony Luck <tony.luck@gmail.com> wrote: >> I switched to BIAS 0xC0000000 ... and now I should get class 1 entries >> (bit31=0, bit30=1). >> >> New patch series coming soon. > > Or not :-( > > arch/x86/lib/lib.a(memcpy_64.o):(__ex_table+0x4): relocation truncated > to fit: R_X86_64_PC32 against `.fixup' > arch/x86/lib/lib.a(memcpy_64.o):(__ex_table+0xc): relocation truncated > to fit: R_X86_64_PC32 against `.fixup' > ... > > I guess it was something like this that made you do the 0x20000000 and > subtract the BIAS? > > I have a bad feeling that we may not really have four classes, just three: > > 00: no funny arithmetic > 10: BIAS = 0x80000000 ... doesn't trigger truncation warning because > sign bit is set > 11: BIAS = 0x40000000 ... ditto > 01: BIAS = ? ... Is there some magic value for BIAS that gets this? > > --- end of Draft ... now to the real bit > > Not sure why I was hung up on *subtracting* values to get the desired > class bits. Just > blindly copying the initial case from your patch? > > If you can't get from A to B one way, try going around the other > direction. Subtracting > 0xC0000000 is the same as adding 0x40000000 (when playing with u32 values). > That doesn't upset the linker. > > I rebased: > git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git mcsafev6 > > still needs a little cleanup, but it all works, and seems to be a much > cleaner approach. So clean that I wonder whether I really need > the CONFIG_MCE_KERNEL_RECOVERY any more?? The only > place it is used now is around the __mcsafe_copy() > Looks nice! It might a bit clearer if you rename fix_class2 to fix_class_ex, etc, and then use the C99 syntax for the table: allclasses ... = { [EXTABLE_CLASS_WHATEVER >> 30] = fix_class_whatever, ... }; you could try "[extable_class(EXTABLE_CLASS_WHATEVER)] = fix_class_whatever" Maybe rename EXTABLE_CLASS_FAULT to EXTABLE_CLASS_FAULT_OR_MC? I might still attack this code later to add the indirect fixup idea rather than just returning the fault number in EAX. --Andy ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2016-01-03 3:40 UTC | newest] Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-12-24 20:54 [PATCHV4 0/3] Machine check recovery when kernel accesses poison Tony Luck 2015-12-16 1:29 ` [PATCHV4 1/3] x86, ras: Add new infrastructure for machine check fixup tables Tony Luck 2015-12-16 1:29 ` [PATCHV4 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas Tony Luck 2015-12-16 1:30 ` [PATCHV4 3/3] x86, ras: Add __mcsafe_copy() function to recover from machine checks Tony Luck 2015-12-24 21:46 ` Borislav Petkov 2015-12-16 1:30 ` [PATCHV5 " Tony Luck 2015-12-25 11:49 ` Borislav Petkov 2015-12-25 20:05 ` Luck, Tony 2015-12-26 10:32 ` Borislav Petkov 2015-12-26 14:54 ` Andy Lutomirski 2015-12-27 2:08 ` Tony Luck 2015-12-27 2:15 ` Andy Lutomirski 2015-12-27 2:16 ` Andy Lutomirski 2015-12-27 6:57 ` Tony Luck 2015-12-27 10:09 ` Borislav Petkov 2015-12-27 12:19 ` Andy Lutomirski 2015-12-27 13:17 ` Boris Petkov 2015-12-27 13:25 ` Andy Lutomirski 2015-12-27 13:33 ` Borislav Petkov 2015-12-27 13:40 ` Andy Lutomirski 2015-12-27 19:04 ` Dan Williams 2015-12-27 12:18 ` Andy Lutomirski 2015-12-30 23:32 ` Tony Luck 2015-12-31 20:30 ` Tony Luck 2015-12-31 21:22 ` Andy Lutomirski 2016-01-01 22:19 ` Tony Luck 2016-01-03 3:40 ` Andy Lutomirski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).