* [PATCH] Linux VM workaround for Knights Landing A/D leak
@ 2016-06-14 15:58 Lukasz Anaczkowski
  2016-06-14 16:31 ` kbuild test robot
  ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Lukasz Anaczkowski @ 2016-06-14 15:58 UTC (permalink / raw)
  To: linux-kernel, linux-mm, tglx, mingo, dave.hansen, ak,
      kirill.shutemov, mhocko, akpm, hpa
  Cc: lukasz.anaczkowski, harish.srinivasappa, lukasz.odzioba

From: Andi Kleen <ak@linux.intel.com>

Knights Landing has an issue where a thread setting A or D bits
may not do so atomically against checking the present bit. A thread
which is going to page fault may still set those bits, even though
the present bit was already atomically cleared.

This implies that when the kernel clears present atomically, the
supposed-to-be-zero entry could later be corrupted with stray A or D
bits.

Since the PTE could already be used for storing a swap index, or a
NUMA migration index, this cannot be tolerated. Most of the time the
kernel detects the problem, but in some rare cases it may not.

This patch enforces that the page unmap path in vmscan/direct reclaim
always flushes other CPUs after clearing each page, and also clears
the PTE again after the flush. For reclaim this brings the performance
back to before Mel's flushing changes, but for unmap it disables
batching.

This makes sure any leaked A/D bits are immediately cleared before the
entry is used for something else.

Any parallel faults that check for a zero entry may loop, but they
should eventually recover after the entry is written.

Also, other users may spin on the page table lock until we have
"fixed" the PTE. This is ensured by always taking the page table lock
even for the swap cache case. Previously this was only done on
architectures with non-atomic PTE accesses (such as 32-bit PTEs), but
now it is also done when this bug workaround is active.

I audited apply_pte_range and other users of arch_enter_lazy... and
none of them appears to clear the present bit.
Right now the extra flush is done in the low level architecture code,
while the higher level code still does a batched TLB flush. This means
there is always one extra unnecessary TLB flush now. As a follow-on
optimization this could be avoided by telling the callers that the
flush already happened.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/hugetlb.h     |  9 ++++++++-
 arch/x86/include/asm/pgtable.h     |  5 +++++
 arch/x86/include/asm/pgtable_64.h  |  6 ++++++
 arch/x86/kernel/cpu/intel.c        | 10 ++++++++++
 arch/x86/mm/tlb.c                  | 20 ++++++++++++++++++++
 include/linux/mm.h                 |  4 ++++
 mm/memory.c                        |  3 ++-
 8 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 4a41348..2c48011 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -303,6 +303,7 @@
 #define X86_BUG_SYSRET_SS_ATTRS	X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
 #define X86_BUG_NULL_SEG	X86_BUG(9) /* Nulling a selector preserves the base */
 #define X86_BUG_SWAPGS_FENCE	X86_BUG(10) /* SWAPGS without input dep on GS */
+#define X86_BUG_PTE_LEAK	X86_BUG(11) /* PTE may leak A/D bits after clear */

 #ifdef CONFIG_X86_32
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 3a10616..58e1ca9 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -41,10 +41,17 @@ static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	set_pte_at(mm, addr, ptep, pte);
 }

+extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr,
+			 pte_t *ptep);
+
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
-	return ptep_get_and_clear(mm, addr, ptep);
+	pte_t pte = ptep_get_and_clear(mm, addr, ptep);
+
+	if (boot_cpu_has_bug(X86_BUG_PTE_LEAK))
+		fix_pte_leak(mm, addr, ptep);
+	return pte;
 }

 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1a27396..9769355 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -794,11 +794,16 @@ extern int ptep_test_and_clear_young(struct vm_area_struct *vma,
 extern int ptep_clear_flush_young(struct vm_area_struct *vma,
 				  unsigned long address, pte_t *ptep);

+extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr,
+			 pte_t *ptep);
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 				       pte_t *ptep)
 {
 	pte_t pte = native_ptep_get_and_clear(ptep);
+	if (boot_cpu_has_bug(X86_BUG_PTE_LEAK))
+		fix_pte_leak(mm, addr, ptep);
 	pte_update(mm, addr, ptep);
 	return pte;
 }
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 2ee7811..6fa4079 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -178,6 +178,12 @@ extern void cleanup_highmap(void);
 extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
 extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);

+#define ARCH_HAS_NEEDS_SWAP_PTL 1
+static inline bool arch_needs_swap_ptl(void)
+{
+	return boot_cpu_has_bug(X86_BUG_PTE_LEAK);
+}
+
 #endif /* !__ASSEMBLY__ */

 #endif /* _ASM_X86_PGTABLE_64_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 6e2ffbe..f499513 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -181,6 +181,16 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 		}
 	}

+	if (c->x86_model == 87) {
+		static bool printed;
+
+		if (!printed) {
+			pr_info("Enabling PTE leaking workaround\n");
+			printed = true;
+		}
+		set_cpu_bug(c, X86_BUG_PTE_LEAK);
+	}
+
 	/*
 	 * Intel Quark Core DevMan_001.pdf section 6.4.11
 	 * "The operating system also is required to invalidate (i.e., flush)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 5643fd0..3d54488 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -468,4 +468,24 @@ static int __init create_tlb_single_page_flush_ceiling(void)
 }
 late_initcall(create_tlb_single_page_flush_ceiling);

+/*
+ * Workaround for KNL issue:
+ *
+ * A thread that is going to page fault due to P=0, may still
+ * non-atomically set A or D bits, which could corrupt swap entries.
+ * Always flush the other CPUs and clear the PTE again to avoid
+ * this leakage. We are excluded using the page table lock.
+ */
+
+void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+{
+	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) {
+		trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
+		flush_tlb_others(mm_cpumask(mm), mm, addr,
+				 addr + PAGE_SIZE);
+		mb();
+		set_pte(ptep, __pte(0));
+	}
+}
+
 #endif /* CONFIG_SMP */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5df5feb..5c80fe09 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2404,6 +2404,10 @@ static inline bool debug_guardpage_enabled(void) { return false; }
 static inline bool page_is_guard(struct page *page) { return false; }
 #endif /* CONFIG_DEBUG_PAGEALLOC */

+#ifndef ARCH_HAS_NEEDS_SWAP_PTL
+static inline bool arch_needs_swap_ptl(void) { return false; }
+#endif
+
 #if MAX_NUMNODES > 1
 void __init setup_nr_node_ids(void);
 #else
diff --git a/mm/memory.c b/mm/memory.c
index 15322b7..0d6ef39 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1960,7 +1960,8 @@ static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
 {
 	int same = 1;
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
-	if (sizeof(pte_t) > sizeof(unsigned long)) {
+	if (arch_needs_swap_ptl() ||
+	    sizeof(pte_t) > sizeof(unsigned long)) {
 		spinlock_t *ptl = pte_lockptr(mm, pmd);
 		spin_lock(ptl);
 		same = pte_same(*page_table, orig_pte);
--
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 15:58 [PATCH] Linux VM workaround for Knights Landing A/D leak Lukasz Anaczkowski
@ 2016-06-14 16:31 ` kbuild test robot
  2016-06-14 16:47 ` Nadav Amit
  ` (3 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: kbuild test robot @ 2016-06-14 16:31 UTC (permalink / raw)
  To: Lukasz Anaczkowski
  Cc: kbuild-all, linux-kernel, linux-mm, tglx, mingo, dave.hansen, ak,
      kirill.shutemov, mhocko, akpm, hpa, lukasz.anaczkowski,
      harish.srinivasappa, lukasz.odzioba

[-- Attachment #1: Type: text/plain, Size: 1398 bytes --]

Hi,

[auto build test ERROR on v4.7-rc3]
[also build test ERROR on next-20160614]
[cannot apply to tip/x86/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Lukasz-Anaczkowski/Linux-VM-workaround-for-Knights-Landing-A-D-leak/20160615-000610
config: i386-alldefconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   mm/built-in.o: In function `unmap_page_range':
   (.text+0x1e9e8): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `change_protection_range':
>> mprotect.c:(.text+0x25578): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `move_page_tables':
   (.text+0x25d81): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `vunmap_page_range':
   vmalloc.c:(.text+0x28419): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `ptep_clear_flush':
   (.text+0x2a7b3): undefined reference to `fix_pte_leak'
   mm/built-in.o:madvise.c:(.text+0x2b4b0): more undefined references to `fix_pte_leak' follow

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 9935 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 15:58 [PATCH] Linux VM workaround for Knights Landing A/D leak Lukasz Anaczkowski
  2016-06-14 16:31 ` kbuild test robot
@ 2016-06-14 16:47 ` Nadav Amit
  2016-06-14 16:54   ` Anaczkowski, Lukasz
  ` (2 more replies)
  2016-06-14 16:58 ` kbuild test robot
  ` (2 subsequent siblings)
  4 siblings, 3 replies; 38+ messages in thread
From: Nadav Amit @ 2016-06-14 16:47 UTC (permalink / raw)
  To: Lukasz Anaczkowski
  Cc: LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Dave Hansen, ak,
      kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin,
      harish.srinivasappa, lukasz.odzioba

Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote:

> From: Andi Kleen <ak@linux.intel.com>

> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> +{

Here there should be a call to smp_mb__after_atomic() to synchronize with
switch_mm. I submitted a similar patch, which is still pending (hint).

> +	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) {
> +		trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
> +		flush_tlb_others(mm_cpumask(mm), mm, addr,
> +				 addr + PAGE_SIZE);
> +		mb();
> +		set_pte(ptep, __pte(0));
> +	}
> +}

Regards,
Nadav

^ permalink raw reply	[flat|nested] 38+ messages in thread
* RE: [PATCH] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 16:47 ` Nadav Amit
@ 2016-06-14 16:54   ` Anaczkowski, Lukasz
  2016-06-14 17:01   ` [PATCH v2] " Lukasz Anaczkowski
  2016-06-14 17:18   ` [PATCH] " Dave Hansen
  2 siblings, 0 replies; 38+ messages in thread
From: Anaczkowski, Lukasz @ 2016-06-14 16:54 UTC (permalink / raw)
  To: Nadav Amit
  Cc: LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Dave Hansen, ak,
      kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin,
      Srinivasappa, Harish, Odzioba, Lukasz

From: Nadav Amit [mailto:nadav.amit@gmail.com]
Sent: Tuesday, June 14, 2016 6:48 PM

> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote:
>> From: Andi Kleen <ak@linux.intel.com>

>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>> +{

> Here there should be a call to smp_mb__after_atomic() to synchronize with
> switch_mm. I submitted a similar patch, which is still pending (hint).

Thanks, Nadav! I'll add this and re-submit the patch.

Cheers,
Lukasz

^ permalink raw reply	[flat|nested] 38+ messages in thread
* [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 16:47 ` Nadav Amit
  2016-06-14 16:54   ` Anaczkowski, Lukasz
@ 2016-06-14 17:01   ` Lukasz Anaczkowski
  2016-06-14 17:24     ` Dave Hansen
  ` (3 more replies)
  2016-06-14 17:18   ` [PATCH] " Dave Hansen
  2 siblings, 4 replies; 38+ messages in thread
From: Lukasz Anaczkowski @ 2016-06-14 17:01 UTC (permalink / raw)
  To: linux-kernel, linux-mm, tglx, mingo, dave.hansen, ak,
      kirill.shutemov, mhocko, akpm, hpa
  Cc: lukasz.anaczkowski, harish.srinivasappa, lukasz.odzioba,
      grzegorz.andrejczuk, lukasz.daniluk

From: Andi Kleen <ak@linux.intel.com>

Knights Landing has an issue where a thread setting A or D bits
may not do so atomically against checking the present bit. A thread
which is going to page fault may still set those bits, even though
the present bit was already atomically cleared.

This implies that when the kernel clears present atomically, the
supposed-to-be-zero entry could later be corrupted with stray A or D
bits.

Since the PTE could already be used for storing a swap index, or a
NUMA migration index, this cannot be tolerated. Most of the time the
kernel detects the problem, but in some rare cases it may not.

This patch enforces that the page unmap path in vmscan/direct reclaim
always flushes other CPUs after clearing each page, and also clears
the PTE again after the flush. For reclaim this brings the performance
back to before Mel's flushing changes, but for unmap it disables
batching.

This makes sure any leaked A/D bits are immediately cleared before the
entry is used for something else.

Any parallel faults that check for a zero entry may loop, but they
should eventually recover after the entry is written.

Also, other users may spin on the page table lock until we have
"fixed" the PTE. This is ensured by always taking the page table lock
even for the swap cache case.
Previously this was only done on architectures with non-atomic PTE
accesses (such as 32-bit PTEs), but now it is also done when this bug
workaround is active.

I audited apply_pte_range and other users of arch_enter_lazy... and
none of them appears to clear the present bit.

Right now the extra flush is done in the low level architecture code,
while the higher level code still does a batched TLB flush. This means
there is always one extra unnecessary TLB flush now. As a follow-on
optimization this could be avoided by telling the callers that the
flush already happened.

v2 (Lukasz Anaczkowski):
() added call to smp_mb__after_atomic() to synchronize with switch_mm,
   based on Nadav's comment
() fixed compilation breakage

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/hugetlb.h     |  9 ++++++++-
 arch/x86/include/asm/pgtable.h     |  5 +++++
 arch/x86/include/asm/pgtable_64.h  |  6 ++++++
 arch/x86/kernel/cpu/intel.c        | 10 ++++++++++
 arch/x86/mm/tlb.c                  | 22 ++++++++++++++++++++++
 include/linux/mm.h                 |  4 ++++
 mm/memory.c                        |  3 ++-
 8 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 4a41348..2c48011 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -303,6 +303,7 @@
 #define X86_BUG_SYSRET_SS_ATTRS	X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
 #define X86_BUG_NULL_SEG	X86_BUG(9) /* Nulling a selector preserves the base */
 #define X86_BUG_SWAPGS_FENCE	X86_BUG(10) /* SWAPGS without input dep on GS */
+#define X86_BUG_PTE_LEAK	X86_BUG(11) /* PTE may leak A/D bits after clear */

 #ifdef CONFIG_X86_32
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 3a10616..58e1ca9 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -41,10 +41,17 @@ static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	set_pte_at(mm, addr, ptep, pte);
 }

+extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr,
+			 pte_t *ptep);
+
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 					    unsigned long addr, pte_t *ptep)
 {
-	return ptep_get_and_clear(mm, addr, ptep);
+	pte_t pte = ptep_get_and_clear(mm, addr, ptep);
+
+	if (boot_cpu_has_bug(X86_BUG_PTE_LEAK))
+		fix_pte_leak(mm, addr, ptep);
+	return pte;
 }

 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1a27396..9769355 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -794,11 +794,16 @@ extern int ptep_test_and_clear_young(struct vm_area_struct *vma,
 extern int ptep_clear_flush_young(struct vm_area_struct *vma,
 				  unsigned long address, pte_t *ptep);

+extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr,
+			 pte_t *ptep);
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 				       pte_t *ptep)
 {
 	pte_t pte = native_ptep_get_and_clear(ptep);
+	if (boot_cpu_has_bug(X86_BUG_PTE_LEAK))
+		fix_pte_leak(mm, addr, ptep);
 	pte_update(mm, addr, ptep);
 	return pte;
 }
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 2ee7811..6fa4079 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -178,6 +178,12 @@ extern void cleanup_highmap(void);
 extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
 extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);

+#define ARCH_HAS_NEEDS_SWAP_PTL 1
+static inline bool arch_needs_swap_ptl(void)
+{
+	return boot_cpu_has_bug(X86_BUG_PTE_LEAK);
+}
+
 #endif /* !__ASSEMBLY__ */

 #endif /* _ASM_X86_PGTABLE_64_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 6e2ffbe..f499513 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -181,6 +181,16 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 		}
 	}

+	if (c->x86_model == 87) {
+		static bool printed;
+
+		if (!printed) {
+			pr_info("Enabling PTE leaking workaround\n");
+			printed = true;
+		}
+		set_cpu_bug(c, X86_BUG_PTE_LEAK);
+	}
+
 	/*
 	 * Intel Quark Core DevMan_001.pdf section 6.4.11
 	 * "The operating system also is required to invalidate (i.e., flush)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 5643fd0..9b4c575 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -469,3 +469,25 @@ static int __init create_tlb_single_page_flush_ceiling(void)
 late_initcall(create_tlb_single_page_flush_ceiling);

 #endif /* CONFIG_SMP */
+
+/*
+ * Workaround for KNL issue:
+ *
+ * A thread that is going to page fault due to P=0, may still
+ * non-atomically set A or D bits, which could corrupt swap entries.
+ * Always flush the other CPUs and clear the PTE again to avoid
+ * this leakage. We are excluded using the page table lock.
+ */
+
+void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+{
+	smp_mb__after_atomic();
+	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) {
+		trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
+		flush_tlb_others(mm_cpumask(mm), mm, addr,
+				 addr + PAGE_SIZE);
+		mb();
+		set_pte(ptep, __pte(0));
+	}
+}
+
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5df5feb..5c80fe09 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2404,6 +2404,10 @@ static inline bool debug_guardpage_enabled(void) { return false; }
 static inline bool page_is_guard(struct page *page) { return false; }
 #endif /* CONFIG_DEBUG_PAGEALLOC */

+#ifndef ARCH_HAS_NEEDS_SWAP_PTL
+static inline bool arch_needs_swap_ptl(void) { return false; }
+#endif
+
 #if MAX_NUMNODES > 1
 void __init setup_nr_node_ids(void);
 #else
diff --git a/mm/memory.c b/mm/memory.c
index 15322b7..0d6ef39 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1960,7 +1960,8 @@ static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
 {
 	int same = 1;
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
-	if (sizeof(pte_t) > sizeof(unsigned long)) {
+	if (arch_needs_swap_ptl() ||
+	    sizeof(pte_t) > sizeof(unsigned long)) {
 		spinlock_t *ptl = pte_lockptr(mm, pmd);
 		spin_lock(ptl);
 		same = pte_same(*page_table, orig_pte);
--
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 17:01   ` [PATCH v2] " Lukasz Anaczkowski
@ 2016-06-14 17:24     ` Dave Hansen
  2016-06-14 18:34       ` One Thousand Gnomes
  2016-06-14 18:10     ` Borislav Petkov
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 38+ messages in thread
From: Dave Hansen @ 2016-06-14 17:24 UTC (permalink / raw)
  To: Lukasz Anaczkowski, linux-kernel, linux-mm, tglx, mingo, ak,
      kirill.shutemov, mhocko, akpm, hpa
  Cc: harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk,
      lukasz.daniluk

On 06/14/2016 10:01 AM, Lukasz Anaczkowski wrote:
> v2 (Lukasz Anaczkowski):
> () fixed compilation breakage

... by unconditionally defining the workaround code, even on kernels
where there is no chance of ever hitting this bug. I think that's a
pretty poor way to do it.

Can we please stick this in one of the intel.c files, so it's only
present on CPU_SUP_INTEL builds?

Which reminds me...

> --- a/arch/x86/include/asm/pgtable_64.h
> +++ b/arch/x86/include/asm/pgtable_64.h
> @@ -178,6 +178,12 @@ extern void cleanup_highmap(void);
>  extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
>  extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
>
> +#define ARCH_HAS_NEEDS_SWAP_PTL 1
> +static inline bool arch_needs_swap_ptl(void)
> +{
> +	return boot_cpu_has_bug(X86_BUG_PTE_LEAK);
> +}

Does this *REALLY* only affect 64-bit kernels?

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 17:24     ` Dave Hansen
@ 2016-06-14 18:34       ` One Thousand Gnomes
  2016-06-14 18:54         ` Dave Hansen
  0 siblings, 1 reply; 38+ messages in thread
From: One Thousand Gnomes @ 2016-06-14 18:34 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Lukasz Anaczkowski, linux-kernel, linux-mm, tglx, mingo, ak,
      kirill.shutemov, mhocko, akpm, hpa, harish.srinivasappa,
      lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk

On Tue, 14 Jun 2016 10:24:16 -0700
Dave Hansen <dave.hansen@linux.intel.com> wrote:

> On 06/14/2016 10:01 AM, Lukasz Anaczkowski wrote:
> > v2 (Lukasz Anaczkowski):
> > () fixed compilation breakage
> ...
>
> By unconditionally defining the workaround code, even on kernels where
> there is no chance of ever hitting this bug. I think that's a pretty
> poor way to do it.
>
> Can we please stick this in one of the intel.c files, so it's only
> present on CPU_SUP_INTEL builds?

Can we please make it use alternatives or some such, so that it just
goes away at boot if it's not a Knights Landing box?

Alan

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 18:34       ` One Thousand Gnomes
@ 2016-06-14 18:54         ` Dave Hansen
  2016-06-14 19:19           ` Borislav Petkov
  0 siblings, 1 reply; 38+ messages in thread
From: Dave Hansen @ 2016-06-14 18:54 UTC (permalink / raw)
  To: One Thousand Gnomes
  Cc: Lukasz Anaczkowski, linux-kernel, linux-mm, tglx, mingo, ak,
      kirill.shutemov, mhocko, akpm, hpa, harish.srinivasappa,
      lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk,
      Borislav Petkov

On 06/14/2016 11:34 AM, One Thousand Gnomes wrote:
> On Tue, 14 Jun 2016 10:24:16 -0700
> Dave Hansen <dave.hansen@linux.intel.com> wrote:
>> On 06/14/2016 10:01 AM, Lukasz Anaczkowski wrote:
>>> v2 (Lukasz Anaczkowski):
>>> () fixed compilation breakage
>> ...
>>
>> By unconditionally defining the workaround code, even on kernels where
>> there is no chance of ever hitting this bug. I think that's a pretty
>> poor way to do it.
>>
>> Can we please stick this in one of the intel.c files, so it's only
>> present on CPU_SUP_INTEL builds?
>
> Can we please make it use alternatives or somesuch so that it just goes
> away at boot if its not a Knights Landing box ?

Lukasz, Borislav suggested using static_cpu_has_bug(), which will do the
alternatives patching. It's definitely the right thing to use here.

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 18:54         ` Dave Hansen
@ 2016-06-14 19:19           ` Borislav Petkov
  2016-06-14 20:20             ` H. Peter Anvin
  0 siblings, 1 reply; 38+ messages in thread
From: Borislav Petkov @ 2016-06-14 19:19 UTC (permalink / raw)
  To: Dave Hansen
  Cc: One Thousand Gnomes, Lukasz Anaczkowski, linux-kernel, linux-mm,
      tglx, mingo, ak, kirill.shutemov, mhocko, akpm, hpa,
      harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk,
      lukasz.daniluk

On Tue, Jun 14, 2016 at 11:54:24AM -0700, Dave Hansen wrote:
> Lukasz, Borislav suggested using static_cpu_has_bug(), which will do the
> alternatives patching. It's definitely the right thing to use here.

Yeah, either that or do an

	alternative_call(null_func, fix_pte_leak, X86_BUG_PTE_LEAK, ...)

or so, and you'll need a dummy function to call on !X86_BUG_PTE_LEAK
CPUs.

The static_cpu_has_bug() thing should most likely cost only a single
JMP (I have to look at the asm), but then, since the callers are
inlined, you'll have to patch all those places where
*ptep_get_and_clear() gets inlined.

Shouldn't be a big deal still but...

"debug-alternative" and a kvm guest should help you there to get a
quick idea.

HTH.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 19:19           ` Borislav Petkov
@ 2016-06-14 20:20             ` H. Peter Anvin
  2016-06-14 20:47               ` Borislav Petkov
  0 siblings, 1 reply; 38+ messages in thread
From: H. Peter Anvin @ 2016-06-14 20:20 UTC (permalink / raw)
  To: Borislav Petkov, Dave Hansen
  Cc: One Thousand Gnomes, Lukasz Anaczkowski, linux-kernel, linux-mm,
      tglx, mingo, ak, kirill.shutemov, mhocko, akpm,
      harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk,
      lukasz.daniluk

On 06/14/16 12:19, Borislav Petkov wrote:
> On Tue, Jun 14, 2016 at 11:54:24AM -0700, Dave Hansen wrote:
>> Lukasz, Borislav suggested using static_cpu_has_bug(), which will do the
>> alternatives patching. It's definitely the right thing to use here.
>
> Yeah, either that or do an
>
> 	alternative_call(null_func, fix_pte_peak, X86_BUG_PTE_LEAK, ...)
>
> or so and you'll need a dummy function to call on !X86_BUG_PTE_LEAK
> CPUs.
>
> The static_cpu_has_bug() thing should be most likely a penalty
> of a single JMP (I have to look at the asm) but then since the
> callers are inlined, you'll have to patch all those places where
> *ptep_get_and_clear() get inlined.
>
> Shouldn't be a big deal still but...
>
> "debug-alternative" and a kvm guest should help you there to get a quick
> idea.

static_cpu_has_bug() should turn into a 5-byte NOP in the common
(bugless) case.

	-hpa

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 20:20             ` H. Peter Anvin
@ 2016-06-14 20:47               ` Borislav Petkov
  2016-06-14 20:54                 ` H. Peter Anvin
  0 siblings, 1 reply; 38+ messages in thread
From: Borislav Petkov @ 2016-06-14 20:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Hansen, One Thousand Gnomes, Lukasz Anaczkowski,
      linux-kernel, linux-mm, tglx, mingo, ak, kirill.shutemov, mhocko,
      akpm, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk,
      lukasz.daniluk

On Tue, Jun 14, 2016 at 01:20:06PM -0700, H. Peter Anvin wrote:
> static_cpu_has_bug() should turn into 5-byte NOP in the common (bugless)
> case.

Yeah, it does. I looked at the asm.

I wasn't 100% sure because I vaguely remember gcc reordering things in
some pathological case, but I'm most likely remembering wrong: if it
were doing that, the whole NOPping out wouldn't work. F'get about
it. :)

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 20:47               ` Borislav Petkov
@ 2016-06-14 20:54                 ` H. Peter Anvin
  2016-06-14 21:02                   ` Borislav Petkov
  0 siblings, 1 reply; 38+ messages in thread
From: H. Peter Anvin @ 2016-06-14 20:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, One Thousand Gnomes, Lukasz Anaczkowski,
      linux-kernel, linux-mm, tglx, mingo, ak, kirill.shutemov, mhocko,
      akpm, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk,
      lukasz.daniluk

On 06/14/16 13:47, Borislav Petkov wrote:
> On Tue, Jun 14, 2016 at 01:20:06PM -0700, H. Peter Anvin wrote:
>> static_cpu_has_bug() should turn into 5-byte NOP in the common (bugless)
>> case.
>
> Yeah, it does. I looked at the asm.
>
> I wasn't 100% sure because I vaguely remember gcc reordering things in
> some pathological case but I'm most likely remembering wrong because if
> it were doing that, then the whole nopping out won't work. F'get about
> it. :)

There was that. It is still possible that we end up with a NOPped-out
JMP right before another JMP; we could perhaps make the patching code
smarter and see if we have a JMP immediately after.

	-hpa

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 20:54                 ` H. Peter Anvin
@ 2016-06-14 21:02                   ` Borislav Petkov
  2016-06-14 21:08                     ` H. Peter Anvin
  2016-06-14 21:13                     ` H. Peter Anvin
  0 siblings, 2 replies; 38+ messages in thread
From: Borislav Petkov @ 2016-06-14 21:02 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Hansen, One Thousand Gnomes, Lukasz Anaczkowski,
      linux-kernel, linux-mm, tglx, mingo, ak, kirill.shutemov, mhocko,
      akpm, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk,
      lukasz.daniluk

On Tue, Jun 14, 2016 at 01:54:25PM -0700, H. Peter Anvin wrote:
> There was that. It is still possible that we end up with NOP a JMP
> right before another JMP; we could perhaps make the patching code
> smarter and see if we have a JMP immediately after.

Yeah, I still can't reproduce that reliably - I remember seeing it at
some point but then dismissing it for another, higher-prio thing. And
now the whole memory is hazy at best.

But you're giving me a great idea right now - I have this kernel
disassembler tool which dumps alternative sections already, and I could
teach it to look for pathological cases around the patching sites and
scream.

Something for my TODO list when I get a quiet moment.

Thanks!

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 21:02                   ` Borislav Petkov
@ 2016-06-14 21:08                     ` H. Peter Anvin
  0 siblings, 0 replies; 38+ messages in thread
From: H. Peter Anvin @ 2016-06-14 21:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, One Thousand Gnomes, Lukasz Anaczkowski,
      linux-kernel, linux-mm, tglx, mingo, ak, kirill.shutemov, mhocko,
      akpm, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk,
      lukasz.daniluk

On 06/14/16 14:02, Borislav Petkov wrote:
> On Tue, Jun 14, 2016 at 01:54:25PM -0700, H. Peter Anvin wrote:
>> There was that. It is still possible that we end up with NOP a JMP
>> right before another JMP; we could perhaps make the patching code
>> smarter and see if we have a JMP immediately after.
>
> Yeah, I still can't get reproduce that reliably - I remember seeing it
> at some point but then dismissing it for another, higher-prio thing. And
> now the whole memory is hazy at best.
>
> But, you're giving me a great idea right now - I have this kernel
> disassembler tool which dumps alternative sections already and I could
> teach it to look for pathological cases around the patching sites and
> scream.
>
> Something for my TODO list when I get a quiet moment.

It's not really pathological; the issue is that asm goto() with an
unreachable clause after it doesn't tell gcc that a certain code path
ought to be linear, so we tell it to fall through. However, if gcc then
wants to have a jump there for whatever reason (perhaps it is part of a
loop), we end up with a redundant jump, so a patch site followed by a
JMP is entirely reasonable.

	-hpa

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-14 21:02 ` Borislav Petkov 2016-06-14 21:08 ` H. Peter Anvin @ 2016-06-14 21:13 ` H. Peter Anvin 1 sibling, 0 replies; 38+ messages in thread From: H. Peter Anvin @ 2016-06-14 21:13 UTC (permalink / raw) To: Borislav Petkov Cc: Dave Hansen, One Thousand Gnomes, Lukasz Anaczkowski, linux-kernel, linux-mm, tglx, mingo, ak, kirill.shutemov, mhocko, akpm, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk On June 14, 2016 2:02:55 PM PDT, Borislav Petkov <bp@alien8.de> wrote: >On Tue, Jun 14, 2016 at 01:54:25PM -0700, H. Peter Anvin wrote: >> There was that. It is still possible that we end up with a NOPped JMP >> right before another JMP; we could perhaps make the patching code >> smarter and see if we have a JMP immediately after. > >Yeah, I still can't reproduce that reliably - I remember seeing it >at some point but then dismissing it for another, higher-prio thing. >And >now the whole memory is hazy at best. > >But, you're giving me a great idea right now - I have this kernel >disassembler tool which dumps alternative sections already and I could >teach it to look for pathological cases around the patching sites and >scream. > >Something for my TODO list when I get a quiet moment. > >Thanks! We talked with the GCC people about always biasing asm goto toward the first label even if followed by __builtin_unreachable(). I don't know if that happened; if so we should probably insert the unreachable for those versions of gcc only. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-14 17:01 ` [PATCH v2] " Lukasz Anaczkowski 2016-06-14 17:24 ` Dave Hansen @ 2016-06-14 18:10 ` Borislav Petkov 2016-06-15 13:12 ` Anaczkowski, Lukasz 2016-06-14 18:38 ` Nadav Amit 2016-06-16 15:14 ` [PATCH v3] " Lukasz Anaczkowski 3 siblings, 1 reply; 38+ messages in thread From: Borislav Petkov @ 2016-06-14 18:10 UTC (permalink / raw) To: Lukasz Anaczkowski Cc: linux-kernel, linux-mm, tglx, mingo, dave.hansen, ak, kirill.shutemov, mhocko, akpm, hpa, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk On Tue, Jun 14, 2016 at 07:01:12PM +0200, Lukasz Anaczkowski wrote: > From: Andi Kleen <ak@linux.intel.com> > > Knights Landing has an issue that a thread setting A or D bits > may not do so atomically against checking the present bit. > A thread which is going to page fault may still set those > bits, even though the present bit was already atomically cleared. > > This implies that when the kernel clears present atomically, > some time later the supposed to be zero entry could be corrupted > with stray A or D bits. > > Since the PTE could be already used for storing a swap index, > or a NUMA migration index, this cannot be tolerated. Most > of the time the kernel detects the problem, but in some > rare cases it may not. > > This patch enforces that the page unmap path in vmscan/direct reclaim > always flushes other CPUs after clearing each page, and also > clears the PTE again after the flush. > > For reclaim this brings the performance back to before Mel's > flushing changes, but for unmap it disables batching. > > This makes sure any leaked A/D bits are immediately cleared before the entry > is used for something else. > > Any parallel faults that check for entry is zero may loop, > but they should eventually recover after the entry is written. > > Also other users may spin in the page table lock until we > "fixed" the PTE. 
This is ensured by always taking the page table lock > even for the swap cache case. Previously this was only done > on architectures with non atomic PTE accesses (such as 32bit PTE), > but now it is also done when this bug workaround is active. > > I audited apply_pte_range and other users of arch_enter_lazy... > and they seem to all not clear the present bit. > > Right now the extra flush is done in the low level > architecture code, while the higher level code still > does batched TLB flush. This means there is always one extra > unnecessary TLB flush now. As a followon optimization > this could be avoided by telling the callers that > the flush already happened. > > v2 (Lukasz Anaczkowski): > () added call to smp_mb__after_atomic() to synchronize with > switch_mm, based on Nadav's comment > () fixed compilation breakage > > Signed-off-by: Andi Kleen <ak@linux.intel.com> > Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> > --- > arch/x86/include/asm/cpufeatures.h | 1 + > arch/x86/include/asm/hugetlb.h | 9 ++++++++- > arch/x86/include/asm/pgtable.h | 5 +++++ > arch/x86/include/asm/pgtable_64.h | 6 ++++++ > arch/x86/kernel/cpu/intel.c | 10 ++++++++++ > arch/x86/mm/tlb.c | 22 ++++++++++++++++++++++ > include/linux/mm.h | 4 ++++ > mm/memory.c | 3 ++- > 8 files changed, 58 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h > index 4a41348..2c48011 100644 > --- a/arch/x86/include/asm/cpufeatures.h > +++ b/arch/x86/include/asm/cpufeatures.h > @@ -303,6 +303,7 @@ > #define X86_BUG_SYSRET_SS_ATTRS X86_BUG(8) /* SYSRET doesn't fix up SS attrs */ > #define X86_BUG_NULL_SEG X86_BUG(9) /* Nulling a selector preserves the base */ > #define X86_BUG_SWAPGS_FENCE X86_BUG(10) /* SWAPGS without input dep on GS */ > +#define X86_BUG_PTE_LEAK X86_BUG(11) /* PTE may leak A/D bits after clear */ > > > #ifdef CONFIG_X86_32 > diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h > 
index 3a10616..58e1ca9 100644 > --- a/arch/x86/include/asm/hugetlb.h > +++ b/arch/x86/include/asm/hugetlb.h > @@ -41,10 +41,17 @@ static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, > set_pte_at(mm, addr, ptep, pte); > } > > +extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr, > + pte_t *ptep); > + > static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm, > unsigned long addr, pte_t *ptep) > { > - return ptep_get_and_clear(mm, addr, ptep); > + pte_t pte = ptep_get_and_clear(mm, addr, ptep); > + > + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) static_cpu_has_bug() > + fix_pte_leak(mm, addr, ptep); > + return pte; > } > > static inline void huge_ptep_clear_flush(struct vm_area_struct *vma, > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h > index 1a27396..9769355 100644 > --- a/arch/x86/include/asm/pgtable.h > +++ b/arch/x86/include/asm/pgtable.h > @@ -794,11 +794,16 @@ extern int ptep_test_and_clear_young(struct vm_area_struct *vma, > extern int ptep_clear_flush_young(struct vm_area_struct *vma, > unsigned long address, pte_t *ptep); > > +extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr, > + pte_t *ptep); > + > #define __HAVE_ARCH_PTEP_GET_AND_CLEAR > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, > pte_t *ptep) > { > pte_t pte = native_ptep_get_and_clear(ptep); > + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) static_cpu_has_bug() > + fix_pte_leak(mm, addr, ptep); > pte_update(mm, addr, ptep); > return pte; > } > diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h > index 2ee7811..6fa4079 100644 > --- a/arch/x86/include/asm/pgtable_64.h > +++ b/arch/x86/include/asm/pgtable_64.h > @@ -178,6 +178,12 @@ extern void cleanup_highmap(void); > extern void init_extra_mapping_uc(unsigned long phys, unsigned long size); > extern void init_extra_mapping_wb(unsigned long phys, unsigned long size); > > +#define 
ARCH_HAS_NEEDS_SWAP_PTL 1 > +static inline bool arch_needs_swap_ptl(void) > +{ > + return boot_cpu_has_bug(X86_BUG_PTE_LEAK); static_cpu_has_bug() > +} > + > #endif /* !__ASSEMBLY__ */ > > #endif /* _ASM_X86_PGTABLE_64_H */ > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c > index 6e2ffbe..f499513 100644 > --- a/arch/x86/kernel/cpu/intel.c > +++ b/arch/x86/kernel/cpu/intel.c > @@ -181,6 +181,16 @@ static void early_init_intel(struct cpuinfo_x86 *c) > } > } > > + if (c->x86_model == 87) { That should be INTEL_FAM6_XEON_PHI_KNL, AFAICT. > + static bool printed; > + > + if (!printed) { > + pr_info("Enabling PTE leaking workaround\n"); > + printed = true; > + } pr_info_once > + set_cpu_bug(c, X86_BUG_PTE_LEAK); > + } > + > /* > * Intel Quark Core DevMan_001.pdf section 6.4.11 > * "The operating system also is required to invalidate (i.e., flush) -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 38+ messages in thread
* RE: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-14 18:10 ` Borislav Petkov @ 2016-06-15 13:12 ` Anaczkowski, Lukasz 0 siblings, 0 replies; 38+ messages in thread From: Anaczkowski, Lukasz @ 2016-06-15 13:12 UTC (permalink / raw) To: Borislav Petkov Cc: linux-kernel, linux-mm, tglx, mingo, dave.hansen, ak, kirill.shutemov, mhocko, akpm, hpa, Srinivasappa, Harish, Odzioba, Lukasz, Andrejczuk, Grzegorz, Daniluk, Lukasz From: Borislav Petkov [mailto:bp@alien8.de] Sent: Tuesday, June 14, 2016 8:10 PM >> + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) > > static_cpu_has_bug() >> + if (c->x86_model == 87) { > > That should be INTEL_FAM6_XEON_PHI_KNL, AFAICT. >> + static bool printed; >> + >> + if (!printed) { >> + pr_info("Enabling PTE leaking workaround\n"); >> + printed = true; >> + } > > pr_info_once Thanks, Boris! This is very valuable. I'll address those comments in next version of the patch. Cheers, Lukasz ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-14 17:01 ` [PATCH v2] " Lukasz Anaczkowski 2016-06-14 17:24 ` Dave Hansen 2016-06-14 18:10 ` Borislav Petkov @ 2016-06-14 18:38 ` Nadav Amit 2016-06-15 13:12 ` Anaczkowski, Lukasz 2016-06-16 15:14 ` [PATCH v3] " Lukasz Anaczkowski 3 siblings, 1 reply; 38+ messages in thread From: Nadav Amit @ 2016-06-14 18:38 UTC (permalink / raw) To: Lukasz Anaczkowski Cc: LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Dave Hansen, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: > From: Andi Kleen <ak@linux.intel.com> > static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm, > unsigned long addr, pte_t *ptep) > { > - return ptep_get_and_clear(mm, addr, ptep); > + pte_t pte = ptep_get_and_clear(mm, addr, ptep); > + > + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) > + fix_pte_leak(mm, addr, ptep); > + return pte; > } I missed it on the previous iteration: ptep_get_and_clear already calls fix_pte_leak when needed. So do you need to call it again here? Thanks, Nadav ^ permalink raw reply [flat|nested] 38+ messages in thread
* RE: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-14 18:38 ` Nadav Amit @ 2016-06-15 13:12 ` Anaczkowski, Lukasz 2016-06-15 20:04 ` Nadav Amit 0 siblings, 1 reply; 38+ messages in thread From: Anaczkowski, Lukasz @ 2016-06-15 13:12 UTC (permalink / raw) To: Nadav Amit Cc: LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Dave Hansen, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, Srinivasappa, Harish, Odzioba, Lukasz, Andrejczuk, Grzegorz, Daniluk, Lukasz From: Nadav Amit [mailto:nadav.amit@gmail.com] Sent: Tuesday, June 14, 2016 8:38 PM >> + pte_t pte = ptep_get_and_clear(mm, addr, ptep); >> + >> + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) >> + fix_pte_leak(mm, addr, ptep); >> + return pte; >> } > > I missed it on the previous iteration: ptep_get_and_clear already calls > fix_pte_leak when needed. So do you need to call it again here? You're right, Nadav. Not needing this. Will be removed in next version of the patch. Cheers, Lukasz ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-15 13:12 ` Anaczkowski, Lukasz @ 2016-06-15 20:04 ` Nadav Amit 2016-06-15 20:10 ` Dave Hansen 0 siblings, 1 reply; 38+ messages in thread From: Nadav Amit @ 2016-06-15 20:04 UTC (permalink / raw) To: Anaczkowski, Lukasz Cc: LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Dave Hansen, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, Srinivasappa, Harish, Odzioba, Lukasz, Andrejczuk, Grzegorz, Daniluk, Lukasz Lukasz <lukasz.anaczkowski@intel.com> wrote: > From: Nadav Amit [mailto:nadav.amit@gmail.com] > Sent: Tuesday, June 14, 2016 8:38 PM > >>> + pte_t pte = ptep_get_and_clear(mm, addr, ptep); >>> + >>> + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) >>> + fix_pte_leak(mm, addr, ptep); >>> + return pte; >>> } >> >> I missed it on the previous iteration: ptep_get_and_clear already calls >> fix_pte_leak when needed. So do you need to call it again here? > > You're right, Nadav. Not needing this. Will be removed in next version of the patch. Be careful here. According to the SDM when invalidating a huge-page, each 4KB page needs to be invalidated separately. In practice, when Linux invalidates 2MB/1GB pages it performs a full TLB flush. The full flush may not be required on knights landing, and specifically for the workaround, but you should check. Regards, Nadav ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-15 20:04 ` Nadav Amit @ 2016-06-15 20:10 ` Dave Hansen 2016-06-15 20:26 ` Nadav Amit 0 siblings, 1 reply; 38+ messages in thread From: Dave Hansen @ 2016-06-15 20:10 UTC (permalink / raw) To: Nadav Amit, Anaczkowski, Lukasz Cc: LKML, linux-mm, Thomas Gleixner, Ingo Molnar, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, Srinivasappa, Harish, Odzioba, Lukasz, Andrejczuk, Grzegorz, Daniluk, Lukasz On 06/15/2016 01:04 PM, Nadav Amit wrote: > Be careful here. According to the SDM when invalidating a huge-page, > each 4KB page needs to be invalidated separately. In practice, when > Linux invalidates 2MB/1GB pages it performs a full TLB flush. The > full flush may not be required on knights landing, and specifically > for the workaround, but you should check. Where do you get that? The SDM says: "they (TLB invalidation operations) invalidate all TLB entries corresponding to the translation specified by the paging structures." Here's the full paragraph from the SDM ... some processors may choose to cache multiple smaller-page TLB entries for a translation specified by the paging structures to use a page larger than 4 KBytes. There is no way for software to be aware that multiple translations for smaller pages have been used for a large page. The INVLPG instruction and page faults provide the same assurances that they provide when a single TLB entry is used: they invalidate all TLB entries corresponding to the translation specified by the paging structures. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2] Linux VM workaround for Knights Landing A/D leak 2016-06-15 20:10 ` Dave Hansen @ 2016-06-15 20:26 ` Nadav Amit 0 siblings, 0 replies; 38+ messages in thread From: Nadav Amit @ 2016-06-15 20:26 UTC (permalink / raw) To: Dave Hansen Cc: Anaczkowski, Lukasz, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, Srinivasappa, Harish, Odzioba, Lukasz, Andrejczuk, Grzegorz, Daniluk, Lukasz Dave Hansen <dave.hansen@linux.intel.com> wrote: > On 06/15/2016 01:04 PM, Nadav Amit wrote: >> Be careful here. According to the SDM when invalidating a huge-page, >> each 4KB page needs to be invalidated separately. In practice, when >> Linux invalidates 2MB/1GB pages it performs a full TLB flush. The >> full flush may not be required on knights landing, and specifically >> for the workaround, but you should check. > > Where do you get that? The SDM says: "they (TLB invalidation operations) > invalidate all TLB entries corresponding to the translation specified by > the paging structures." You are absolutely correct. Last time I write something based on my recollection of the SDM without re-reading again. Sorry. Nadav ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v3] Linux VM workaround for Knights Landing A/D leak 2016-06-14 17:01 ` [PATCH v2] " Lukasz Anaczkowski ` (2 preceding siblings ...) 2016-06-14 18:38 ` Nadav Amit @ 2016-06-16 15:14 ` Lukasz Anaczkowski 2016-06-16 16:43 ` Nadav Amit 2016-06-16 20:23 ` Dave Hansen 3 siblings, 2 replies; 38+ messages in thread From: Lukasz Anaczkowski @ 2016-06-16 15:14 UTC (permalink / raw) To: hpa, mingo, tglx, dave.hansen, ak, kirill.shutemov, mhocko, akpm, linux-kernel, linux-mm Cc: lukasz.anaczkowski, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk From: Andi Kleen <ak@linux.intel.com> Knights Landing has an issue that a thread setting A or D bits may not do so atomically against checking the present bit. A thread which is going to page fault may still set those bits, even though the present bit was already atomically cleared. This implies that when the kernel clears present atomically, some time later the supposed to be zero entry could be corrupted with stray A or D bits. Since the PTE could be already used for storing a swap index, or a NUMA migration index, this cannot be tolerated. Most of the time the kernel detects the problem, but in some rare cases it may not. This patch enforces that the page unmap path in vmscan/direct reclaim always flushes other CPUs after clearing each page, and also clears the PTE again after the flush. A new memory barrier may be required, but this code is at least consistent with all the existing uses in the kernel. If we decide that we need a new barrier, it can be added at the same time as the rest of the tree. For reclaim this brings the performance back to before Mel's flushing changes, but for unmap it disables batching. This makes sure any leaked A/D bits are immediately cleared before the entry is used for something else. Any parallel faults that check for entry is zero may loop, but they should eventually recover after the entry is written. 
Also other users may spin in the page table lock until we "fixed" the PTE. This is ensured by always taking the page table lock even for the swap cache case. Previously this was only done on architectures with non atomic PTE accesses (such as 32bit PTE), but now it is also done when this bug workaround is active. I audited apply_pte_range and other users of arch_enter_lazy... and they seem to all not clear the present bit. Right now the extra flush is done in the low level architecture code, while the higher level code still does batched TLB flush. This means there is always one extra unnecessary TLB flush now. As a followon optimization this could be avoided by telling the callers that the flush already happened. The official erratum will be posted hopefully by the end of July'16. v3 (Lukasz Anaczkowski): () Improved documentation () Removed unnecessary declaration and call to fix_pte_leak from hugetlb.h () Moved fix_pte_leak() definition from tlb.c to intel.c () Replaced boot_cpu_has_bug() with static_cpu_has_bug() () pr_info_once instead of pr_info () Fix applies only to 64-bit kernels as Knights Landing does not support 32-bit kernels v2 (Lukasz Anaczkowski): () added call to smp_mb__after_atomic() to synchronize with switch_mm, based on Nadav's comment () fixed compilation breakage Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> --- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/pgtable.h | 10 ++++++++++ arch/x86/include/asm/pgtable_64.h | 16 +++++++++++++++ arch/x86/kernel/cpu/intel.c | 41 ++++++++++++++++++++++++++++++++++++++ include/linux/mm.h | 4 ++++ mm/memory.c | 3 ++- 6 files changed, 74 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 4a41348..2c48011 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -303,6 +303,7 @@ #define X86_BUG_SYSRET_SS_ATTRS X86_BUG(8) /* 
SYSRET doesn't fix up SS attrs */ #define X86_BUG_NULL_SEG X86_BUG(9) /* Nulling a selector preserves the base */ #define X86_BUG_SWAPGS_FENCE X86_BUG(10) /* SWAPGS without input dep on GS */ +#define X86_BUG_PTE_LEAK X86_BUG(11) /* PTE may leak A/D bits after clear */ #ifdef CONFIG_X86_32 diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 1a27396..14abcb3 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -794,11 +794,21 @@ extern int ptep_test_and_clear_young(struct vm_area_struct *vma, extern int ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long address, pte_t *ptep); +#if defined(CONFIG_X86_64) && defined(CONFIG_CPU_SUP_INTEL) +extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr, + pte_t *ptep); +#else +static inline void fix_pte_leak(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) {} +#endif + #define __HAVE_ARCH_PTEP_GET_AND_CLEAR static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { pte_t pte = native_ptep_get_and_clear(ptep); + if (static_cpu_has_bug(X86_BUG_PTE_LEAK)) + fix_pte_leak(mm, addr, ptep); pte_update(mm, addr, ptep); return pte; } diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h index 2ee7811..be7d63c 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -178,6 +178,22 @@ extern void cleanup_highmap(void); extern void init_extra_mapping_uc(unsigned long phys, unsigned long size); extern void init_extra_mapping_wb(unsigned long phys, unsigned long size); +/* + * Intel Xeon Phi x200 codenamed Knights Landing has an issue that a thread + * setting A or D bits may not do so atomically against checking the present + * bit. A thread which is going to page fault may still set those + * bits, even though the present bit was already atomically cleared. 
+ * + * This implies that when the kernel clears present atomically, + * some time later the supposed to be zero entry could be corrupted + * with stray A or D bits. + */ +#define ARCH_HAS_NEEDS_SWAP_PTL 1 +static inline bool arch_needs_swap_ptl(void) +{ + return static_cpu_has_bug(X86_BUG_PTE_LEAK); +} + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_X86_PGTABLE_64_H */ diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 6e2ffbe..b5f5bab 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -7,12 +7,17 @@ #include <linux/thread_info.h> #include <linux/module.h> #include <linux/uaccess.h> +#include <linux/mm_types.h> #include <asm/cpufeature.h> #include <asm/pgtable.h> #include <asm/msr.h> #include <asm/bugs.h> #include <asm/cpu.h> +#include <asm/intel-family.h> +#include <asm/tlbflush.h> + +#include <trace/events/tlb.h> #ifdef CONFIG_X86_64 #include <linux/topology.h> @@ -60,6 +65,35 @@ void check_mpx_erratum(struct cpuinfo_x86 *c) } } +#ifdef CONFIG_X86_64 +/* + * Knights Landing has an issue that a thread setting A or D bits + * may not do so atomically against checking the present bit. + * A thread which is going to page fault may still set those + * bits, even though the present bit was already atomically cleared. + * + * Entering here, the current CPU just cleared the PTE. But, + * another thread may have raced and set the A or D bits, or be + * _about_ to set the bits. Shooting their TLB entry down will + * ensure they see the cleared PTE and will not set A or D and + * won't corrupt swap entries. + */ +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) +{ + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); + flush_tlb_others(mm_cpumask(mm), mm, addr, + addr + PAGE_SIZE); + mb(); + /* + * Clear the PTE one more time, in case the other thread set A/D + * before we sent the TLB flush. 
+ */ + set_pte(ptep, __pte(0)); + } +} +#endif + static void early_init_intel(struct cpuinfo_x86 *c) { u64 misc_enable; @@ -181,6 +215,13 @@ static void early_init_intel(struct cpuinfo_x86 *c) } } +#ifdef CONFIG_X86_64 + if (c->x86_model == INTEL_FAM6_XEON_PHI_KNL) { + pr_info_once("x86/intel: Enabling PTE leaking workaround\n"); + set_cpu_bug(c, X86_BUG_PTE_LEAK); + } +#endif + /* * Intel Quark Core DevMan_001.pdf section 6.4.11 * "The operating system also is required to invalidate (i.e., flush) diff --git a/include/linux/mm.h b/include/linux/mm.h index 5df5feb..5c80fe09 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2404,6 +2404,10 @@ static inline bool debug_guardpage_enabled(void) { return false; } static inline bool page_is_guard(struct page *page) { return false; } #endif /* CONFIG_DEBUG_PAGEALLOC */ +#ifndef ARCH_HAS_NEEDS_SWAP_PTL +static inline bool arch_needs_swap_ptl(void) { return false; } +#endif + #if MAX_NUMNODES > 1 void __init setup_nr_node_ids(void); #else diff --git a/mm/memory.c b/mm/memory.c index 15322b7..0d6ef39 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1960,7 +1960,8 @@ static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd, { int same = 1; #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT) - if (sizeof(pte_t) > sizeof(unsigned long)) { + if (arch_needs_swap_ptl() || + sizeof(pte_t) > sizeof(unsigned long)) { spinlock_t *ptl = pte_lockptr(mm, pmd); spin_lock(ptl); same = pte_same(*page_table, orig_pte); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH v3] Linux VM workaround for Knights Landing A/D leak 2016-06-16 15:14 ` [PATCH v3] " Lukasz Anaczkowski @ 2016-06-16 16:43 ` Nadav Amit 2016-06-16 20:23 ` Dave Hansen 1 sibling, 0 replies; 38+ messages in thread From: Nadav Amit @ 2016-06-16 16:43 UTC (permalink / raw) To: Lukasz Anaczkowski Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Dave Hansen, ak, kirill.shutemov, mhocko, akpm, linux-kernel, linux-mm, harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: > From: Andi Kleen <ak@linux.intel.com> > > +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) > +{ > + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { > + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); This tracing seems incorrect since you don’t perform a local flush. I don’t think you need any tracing - native_flush_tlb_others will do it for you. > + flush_tlb_others(mm_cpumask(mm), mm, addr, > + addr + PAGE_SIZE); > + mb(); Why do you need the memory barrier? Regards, Nadav ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3] Linux VM workaround for Knights Landing A/D leak 2016-06-16 15:14 ` [PATCH v3] " Lukasz Anaczkowski 2016-06-16 16:43 ` Nadav Amit @ 2016-06-16 20:23 ` Dave Hansen 1 sibling, 0 replies; 38+ messages in thread From: Dave Hansen @ 2016-06-16 20:23 UTC (permalink / raw) To: Lukasz Anaczkowski, hpa, mingo, tglx, ak, kirill.shutemov, mhocko, akpm, linux-kernel, linux-mm Cc: harish.srinivasappa, lukasz.odzioba, grzegorz.andrejczuk, lukasz.daniluk On 06/16/2016 08:14 AM, Lukasz Anaczkowski wrote: > For reclaim this brings the performance back to before Mel's > flushing changes, but for unmap it disables batching. This turns out to be pretty catastrophic for unmap. In a workload that uses, say, 200 hardware threads and allocates/frees a few MB/sec, this ends up costing hundreds of thousands of extra received IPIs. 10MB = ~2500 PTEs, and with 200 threads, that's 250,000 IPIs received just to free 10MB of memory. The initial testing we did on this was on a *bunch* of threads all doing alloc/free. But this is bottlenecked on other things, like mmap_sem being held for write. The scenario that we really needed to test here was on lots of threads doing processing and 1 thread doing alloc/free. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-14 16:47 ` Nadav Amit 2016-06-14 16:54 ` Anaczkowski, Lukasz 2016-06-14 17:01 ` [PATCH v2] " Lukasz Anaczkowski @ 2016-06-14 17:18 ` Dave Hansen 2016-06-14 20:16 ` Nadav Amit 2 siblings, 1 reply; 38+ messages in thread From: Dave Hansen @ 2016-06-14 17:18 UTC (permalink / raw) To: Nadav Amit, Lukasz Anaczkowski Cc: LKML, linux-mm, Thomas Gleixner, Ingo Molnar, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba On 06/14/2016 09:47 AM, Nadav Amit wrote: > Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: > >> > From: Andi Kleen <ak@linux.intel.com> >> > +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >> > +{ > Here there should be a call to smp_mb__after_atomic() to synchronize with > switch_mm. I submitted a similar patch, which is still pending (hint). > >> > + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >> > + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >> > + flush_tlb_others(mm_cpumask(mm), mm, addr, >> > + addr + PAGE_SIZE); >> > + mb(); >> > + set_pte(ptep, __pte(0)); >> > + } >> > +} Shouldn't that barrier be incorporated in the TLB flush code itself and not every single caller (like this code is)? It is insane to require individual TLB flushers to be concerned with the barriers. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-14 17:18 ` [PATCH] " Dave Hansen @ 2016-06-14 20:16 ` Nadav Amit 2016-06-14 21:37 ` Dave Hansen 0 siblings, 1 reply; 38+ messages in thread From: Nadav Amit @ 2016-06-14 20:16 UTC (permalink / raw) To: Dave Hansen Cc: Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba Dave Hansen <dave.hansen@linux.intel.com> wrote: > On 06/14/2016 09:47 AM, Nadav Amit wrote: >> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >> >>>> From: Andi Kleen <ak@linux.intel.com> >>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>> +{ >> Here there should be a call to smp_mb__after_atomic() to synchronize with >> switch_mm. I submitted a similar patch, which is still pending (hint). >> >>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>> + addr + PAGE_SIZE); >>>> + mb(); >>>> + set_pte(ptep, __pte(0)); >>>> + } >>>> +} > > Shouldn't that barrier be incorporated in the TLB flush code itself and > not every single caller (like this code is)? > > It is insane to require individual TLB flushers to be concerned with the > barriers. IMHO it is best to use existing flushing interfaces instead of creating new ones. In theory, fix_pte_leak could have used flush_tlb_page. But the problem is that flush_tlb_page requires the vm_area_struct as an argument, which ptep_get_and_clear (and others) do not have. I don’t know which architecture needs the vm_area_struct, since x86 and some others I looked at (e.g., ARM) only need the mm_struct. Nadav ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-14 20:16 ` Nadav Amit @ 2016-06-14 21:37 ` Dave Hansen 2016-06-15 2:20 ` Andy Lutomirski 2016-06-15 3:20 ` Nadav Amit 0 siblings, 2 replies; 38+ messages in thread From: Dave Hansen @ 2016-06-14 21:37 UTC (permalink / raw) To: Nadav Amit Cc: Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, ak, kirill.shutemov, mhocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba, Andy Lutomirski On 06/14/2016 01:16 PM, Nadav Amit wrote: > Dave Hansen <dave.hansen@linux.intel.com> wrote: > >> On 06/14/2016 09:47 AM, Nadav Amit wrote: >>> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >>> >>>>> From: Andi Kleen <ak@linux.intel.com> >>>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>>> +{ >>> Here there should be a call to smp_mb__after_atomic() to synchronize with >>> switch_mm. I submitted a similar patch, which is still pending (hint). >>> >>>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>>> + addr + PAGE_SIZE); >>>>> + mb(); >>>>> + set_pte(ptep, __pte(0)); >>>>> + } >>>>> +} >> >> Shouldn't that barrier be incorporated in the TLB flush code itself and >> not every single caller (like this code is)? >> >> It is insane to require individual TLB flushers to be concerned with the >> barriers. > > IMHO it is best to use existing flushing interfaces instead of creating > new ones. Yeah, or make these things a _little_ harder to get wrong. That little snippet above isn't so crazy that we should be depending on open-coded barriers to get it right. Should we just add a barrier to mm_cpumask() itself? That should stop the race. Or maybe we need a new primitive like: /* * Call this if a full barrier has been executed since the last * pagetable modification operation. 
*/ static int __other_cpus_need_tlb_flush(struct mm_struct *mm) { /* cpumask_any_but() returns >= nr_cpu_ids if no cpus set. */ return cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids; } static int other_cpus_need_tlb_flush(struct mm_struct *mm) { /* * Synchronizes with switch_mm. Makes sure that we do not * observe a bit having been cleared in mm_cpumask() before * the other processor has seen our pagetable update. See * switch_mm(). */ smp_mb__after_atomic(); return __other_cpus_need_tlb_flush(mm) } We should be able to deploy other_cpus_need_tlb_flush() in most of the cases where we are doing "cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids". Right? > In theory, fix_pte_leak could have used flush_tlb_page. But the problem > is that flush_tlb_page requires the vm_area_struct as an argument, which > ptep_get_and_clear (and others) do not have. That, and we do not want/need to flush the _current_ processor's TLB. flush_tlb_page() would have done that unnecessarily. That's not the end of the world here, but it is a downside. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-14 21:37 ` Dave Hansen @ 2016-06-15 2:20 ` Andy Lutomirski 2016-06-15 2:35 ` Nadav Amit 2016-06-15 3:20 ` Nadav Amit 1 sibling, 1 reply; 38+ messages in thread From: Andy Lutomirski @ 2016-06-15 2:20 UTC (permalink / raw) To: Dave Hansen Cc: Nadav Amit, Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Andi Kleen, Kirill A. Shutemov, Michal Hocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba On Tue, Jun 14, 2016 at 2:37 PM, Dave Hansen <dave.hansen@linux.intel.com> wrote: > On 06/14/2016 01:16 PM, Nadav Amit wrote: >> Dave Hansen <dave.hansen@linux.intel.com> wrote: >> >>> On 06/14/2016 09:47 AM, Nadav Amit wrote: >>>> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >>>> >>>>>> From: Andi Kleen <ak@linux.intel.com> >>>>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>>>> +{ >>>> Here there should be a call to smp_mb__after_atomic() to synchronize with >>>> switch_mm. I submitted a similar patch, which is still pending (hint). >>>> >>>>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>>>> + addr + PAGE_SIZE); >>>>>> + mb(); >>>>>> + set_pte(ptep, __pte(0)); >>>>>> + } >>>>>> +} >>> >>> Shouldn't that barrier be incorporated in the TLB flush code itself and >>> not every single caller (like this code is)? >>> >>> It is insane to require individual TLB flushers to be concerned with the >>> barriers. >> >> IMHO it is best to use existing flushing interfaces instead of creating >> new ones. > > Yeah, or make these things a _little_ harder to get wrong. That little > snippet above isn't so crazy that we should be depending on open-coded > barriers to get it right. > > Should we just add a barrier to mm_cpumask() itself? That should stop > the race. 
Or maybe we need a new primitive like: > > /* > * Call this if a full barrier has been executed since the last > * pagetable modification operation. > */ > static int __other_cpus_need_tlb_flush(struct mm_struct *mm) > { > /* cpumask_any_but() returns >= nr_cpu_ids if no cpus set. */ > return cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < > nr_cpu_ids; > } > > > static int other_cpus_need_tlb_flush(struct mm_struct *mm) > { > /* > * Synchronizes with switch_mm. Makes sure that we do not > * observe a bit having been cleared in mm_cpumask() before > * the other processor has seen our pagetable update. See > * switch_mm(). > */ > smp_mb__after_atomic(); > > return __other_cpus_need_tlb_flush(mm) > } > > We should be able to deploy other_cpus_need_tlb_flush() in most of the > cases where we are doing "cpumask_any_but(mm_cpumask(mm), > smp_processor_id()) < nr_cpu_ids". IMO this is a bit nuts. smp_mb__after_atomic() doesn't do anything on x86. And, even if it did, why should the flush code assume that the previous store was atomic? What's the issue being fixed / worked around here? ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-15 2:20 ` Andy Lutomirski @ 2016-06-15 2:35 ` Nadav Amit 2016-06-15 2:36 ` Andy Lutomirski 0 siblings, 1 reply; 38+ messages in thread From: Nadav Amit @ 2016-06-15 2:35 UTC (permalink / raw) To: Andy Lutomirski Cc: Dave Hansen, Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Andi Kleen, Kirill A. Shutemov, Michal Hocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba Andy Lutomirski <luto@amacapital.net> wrote: > On Tue, Jun 14, 2016 at 2:37 PM, Dave Hansen > <dave.hansen@linux.intel.com> wrote: >> On 06/14/2016 01:16 PM, Nadav Amit wrote: >>> Dave Hansen <dave.hansen@linux.intel.com> wrote: >>> >>>> On 06/14/2016 09:47 AM, Nadav Amit wrote: >>>>> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >>>>> >>>>>>> From: Andi Kleen <ak@linux.intel.com> >>>>>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>>>>> +{ >>>>> Here there should be a call to smp_mb__after_atomic() to synchronize with >>>>> switch_mm. I submitted a similar patch, which is still pending (hint). >>>>> >>>>>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>>>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>>>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>>>>> + addr + PAGE_SIZE); >>>>>>> + mb(); >>>>>>> + set_pte(ptep, __pte(0)); >>>>>>> + } >>>>>>> +} >>>> >>>> Shouldn't that barrier be incorporated in the TLB flush code itself and >>>> not every single caller (like this code is)? >>>> >>>> It is insane to require individual TLB flushers to be concerned with the >>>> barriers. >>> >>> IMHO it is best to use existing flushing interfaces instead of creating >>> new ones. >> >> Yeah, or make these things a _little_ harder to get wrong. That little >> snippet above isn't so crazy that we should be depending on open-coded >> barriers to get it right. >> >> Should we just add a barrier to mm_cpumask() itself? 
That should stop >> the race. Or maybe we need a new primitive like: >> >> /* >> * Call this if a full barrier has been executed since the last >> * pagetable modification operation. >> */ >> static int __other_cpus_need_tlb_flush(struct mm_struct *mm) >> { >> /* cpumask_any_but() returns >= nr_cpu_ids if no cpus set. */ >> return cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < >> nr_cpu_ids; >> } >> >> >> static int other_cpus_need_tlb_flush(struct mm_struct *mm) >> { >> /* >> * Synchronizes with switch_mm. Makes sure that we do not >> * observe a bit having been cleared in mm_cpumask() before >> * the other processor has seen our pagetable update. See >> * switch_mm(). >> */ >> smp_mb__after_atomic(); >> >> return __other_cpus_need_tlb_flush(mm) >> } >> >> We should be able to deploy other_cpus_need_tlb_flush() in most of the >> cases where we are doing "cpumask_any_but(mm_cpumask(mm), >> smp_processor_id()) < nr_cpu_ids". > > IMO this is a bit nuts. smp_mb__after_atomic() doesn't do anything on > x86. And, even if it did, why should the flush code assume that the > previous store was atomic? > > What's the issue being fixed / worked around here? It does a compiler barrier, which prevents the decision whether a remote TLB shootdown is required to be made before the PTE is set. I agree that PTEs may not be written atomically in certain cases (although I am unaware of such cases, except on full-mm flush). Having said that, I think that all the TLB flush/shootdown logic should not be open-coded at all and be left in the arch-specific implementation. People keep making small mistakes when they reimplement the flushing logic. This patch, for example, also has a bug in the way it traces the flush - it marks full flush, when it flushes a single page: >>>>>>> trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); Regards, Nadav ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-15 2:35 ` Nadav Amit @ 2016-06-15 2:36 ` Andy Lutomirski 2016-06-15 2:44 ` Nadav Amit 0 siblings, 1 reply; 38+ messages in thread From: Andy Lutomirski @ 2016-06-15 2:36 UTC (permalink / raw) To: Nadav Amit Cc: Dave Hansen, Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Andi Kleen, Kirill A. Shutemov, Michal Hocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba On Tue, Jun 14, 2016 at 7:35 PM, Nadav Amit <nadav.amit@gmail.com> wrote: > Andy Lutomirski <luto@amacapital.net> wrote: > >> On Tue, Jun 14, 2016 at 2:37 PM, Dave Hansen >> <dave.hansen@linux.intel.com> wrote: >>> On 06/14/2016 01:16 PM, Nadav Amit wrote: >>>> Dave Hansen <dave.hansen@linux.intel.com> wrote: >>>> >>>>> On 06/14/2016 09:47 AM, Nadav Amit wrote: >>>>>> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >>>>>> >>>>>>>> From: Andi Kleen <ak@linux.intel.com> >>>>>>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>>>>>> +{ >>>>>> Here there should be a call to smp_mb__after_atomic() to synchronize with >>>>>> switch_mm. I submitted a similar patch, which is still pending (hint). >>>>>> >>>>>>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>>>>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>>>>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>>>>>> + addr + PAGE_SIZE); >>>>>>>> + mb(); >>>>>>>> + set_pte(ptep, __pte(0)); >>>>>>>> + } >>>>>>>> +} >>>>> >>>>> Shouldn't that barrier be incorporated in the TLB flush code itself and >>>>> not every single caller (like this code is)? >>>>> >>>>> It is insane to require individual TLB flushers to be concerned with the >>>>> barriers. >>>> >>>> IMHO it is best to use existing flushing interfaces instead of creating >>>> new ones. >>> >>> Yeah, or make these things a _little_ harder to get wrong. 
That little >>> snippet above isn't so crazy that we should be depending on open-coded >>> barriers to get it right. >>> >>> Should we just add a barrier to mm_cpumask() itself? That should stop >>> the race. Or maybe we need a new primitive like: >>> >>> /* >>> * Call this if a full barrier has been executed since the last >>> * pagetable modification operation. >>> */ >>> static int __other_cpus_need_tlb_flush(struct mm_struct *mm) >>> { >>> /* cpumask_any_but() returns >= nr_cpu_ids if no cpus set. */ >>> return cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < >>> nr_cpu_ids; >>> } >>> >>> >>> static int other_cpus_need_tlb_flush(struct mm_struct *mm) >>> { >>> /* >>> * Synchronizes with switch_mm. Makes sure that we do not >>> * observe a bit having been cleared in mm_cpumask() before >>> * the other processor has seen our pagetable update. See >>> * switch_mm(). >>> */ >>> smp_mb__after_atomic(); >>> >>> return __other_cpus_need_tlb_flush(mm) >>> } >>> >>> We should be able to deploy other_cpus_need_tlb_flush() in most of the >>> cases where we are doing "cpumask_any_but(mm_cpumask(mm), >>> smp_processor_id()) < nr_cpu_ids". >> >> IMO this is a bit nuts. smp_mb__after_atomic() doesn't do anything on >> x86. And, even if it did, why should the flush code assume that the >> previous store was atomic? >> >> What's the issue being fixed / worked around here? > > It does a compiler barrier, which prevents the decision whether a > remote TLB shootdown is required to be made before the PTE is set. > > I agree that PTEs may not be written atomically in certain cases > (although I am unaware of such cases, except on full-mm flush). How about plain set_pte? It's atomic (aligned word-sized write), but it's not atomic in the _after_atomic sense. --Andy ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-15 2:36 ` Andy Lutomirski @ 2016-06-15 2:44 ` Nadav Amit 2016-06-15 3:09 ` Andy Lutomirski 0 siblings, 1 reply; 38+ messages in thread From: Nadav Amit @ 2016-06-15 2:44 UTC (permalink / raw) To: Andy Lutomirski Cc: Dave Hansen, Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Andi Kleen, Kirill A. Shutemov, Michal Hocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba Andy Lutomirski <luto@amacapital.net> wrote: > On Tue, Jun 14, 2016 at 7:35 PM, Nadav Amit <nadav.amit@gmail.com> wrote: >> Andy Lutomirski <luto@amacapital.net> wrote: >> >>> On Tue, Jun 14, 2016 at 2:37 PM, Dave Hansen >>> <dave.hansen@linux.intel.com> wrote: >>>> On 06/14/2016 01:16 PM, Nadav Amit wrote: >>>>> Dave Hansen <dave.hansen@linux.intel.com> wrote: >>>>> >>>>>> On 06/14/2016 09:47 AM, Nadav Amit wrote: >>>>>>> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >>>>>>> >>>>>>>>> From: Andi Kleen <ak@linux.intel.com> >>>>>>>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>>>>>>> +{ >>>>>>> Here there should be a call to smp_mb__after_atomic() to synchronize with >>>>>>> switch_mm. I submitted a similar patch, which is still pending (hint). >>>>>>> >>>>>>>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>>>>>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>>>>>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>>>>>>> + addr + PAGE_SIZE); >>>>>>>>> + mb(); >>>>>>>>> + set_pte(ptep, __pte(0)); >>>>>>>>> + } >>>>>>>>> +} >>>>>> >>>>>> Shouldn't that barrier be incorporated in the TLB flush code itself and >>>>>> not every single caller (like this code is)? >>>>>> >>>>>> It is insane to require individual TLB flushers to be concerned with the >>>>>> barriers. >>>>> >>>>> IMHO it is best to use existing flushing interfaces instead of creating >>>>> new ones. 
>>>> >>>> Yeah, or make these things a _little_ harder to get wrong. That little >>>> snippet above isn't so crazy that we should be depending on open-coded >>>> barriers to get it right. >>>> >>>> Should we just add a barrier to mm_cpumask() itself? That should stop >>>> the race. Or maybe we need a new primitive like: >>>> >>>> /* >>>> * Call this if a full barrier has been executed since the last >>>> * pagetable modification operation. >>>> */ >>>> static int __other_cpus_need_tlb_flush(struct mm_struct *mm) >>>> { >>>> /* cpumask_any_but() returns >= nr_cpu_ids if no cpus set. */ >>>> return cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < >>>> nr_cpu_ids; >>>> } >>>> >>>> >>>> static int other_cpus_need_tlb_flush(struct mm_struct *mm) >>>> { >>>> /* >>>> * Synchronizes with switch_mm. Makes sure that we do not >>>> * observe a bit having been cleared in mm_cpumask() before >>>> * the other processor has seen our pagetable update. See >>>> * switch_mm(). >>>> */ >>>> smp_mb__after_atomic(); >>>> >>>> return __other_cpus_need_tlb_flush(mm) >>>> } >>>> >>>> We should be able to deploy other_cpus_need_tlb_flush() in most of the >>>> cases where we are doing "cpumask_any_but(mm_cpumask(mm), >>>> smp_processor_id()) < nr_cpu_ids". >>> >>> IMO this is a bit nuts. smp_mb__after_atomic() doesn't do anything on >>> x86. And, even if it did, why should the flush code assume that the >>> previous store was atomic? >>> >>> What's the issue being fixed / worked around here? >> >> It does a compiler barrier, which prevents the decision whether a >> remote TLB shootdown is required to be made before the PTE is set. >> >> I agree that PTEs may not be written atomically in certain cases >> (although I am unaware of such cases, except on full-mm flush). > > How about plain set_pte? It's atomic (aligned word-sized write), but > it's not atomic in the _after_atomic sense. 
Can you point me to a place where set_pte is used before a TLB invalidation/shootdown, excluding this patch and the fullmm case? I am not claiming there is no such case, but I am unaware of such one. PTEs are cleared on SMP using xchg, and similarly the dirty bit is cleared with an atomic operation. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-15 2:44 ` Nadav Amit @ 2016-06-15 3:09 ` Andy Lutomirski 0 siblings, 0 replies; 38+ messages in thread From: Andy Lutomirski @ 2016-06-15 3:09 UTC (permalink / raw) To: Nadav Amit Cc: Dave Hansen, Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Andi Kleen, Kirill A. Shutemov, Michal Hocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba On Tue, Jun 14, 2016 at 7:44 PM, Nadav Amit <nadav.amit@gmail.com> wrote: > Andy Lutomirski <luto@amacapital.net> wrote: > >> On Tue, Jun 14, 2016 at 7:35 PM, Nadav Amit <nadav.amit@gmail.com> wrote: >>> Andy Lutomirski <luto@amacapital.net> wrote: >>> >>>> On Tue, Jun 14, 2016 at 2:37 PM, Dave Hansen >>>> <dave.hansen@linux.intel.com> wrote: >>>>> On 06/14/2016 01:16 PM, Nadav Amit wrote: >>>>>> Dave Hansen <dave.hansen@linux.intel.com> wrote: >>>>>> >>>>>>> On 06/14/2016 09:47 AM, Nadav Amit wrote: >>>>>>>> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >>>>>>>> >>>>>>>>>> From: Andi Kleen <ak@linux.intel.com> >>>>>>>>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>>>>>>>> +{ >>>>>>>> Here there should be a call to smp_mb__after_atomic() to synchronize with >>>>>>>> switch_mm. I submitted a similar patch, which is still pending (hint). >>>>>>>> >>>>>>>>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>>>>>>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>>>>>>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>>>>>>>> + addr + PAGE_SIZE); >>>>>>>>>> + mb(); >>>>>>>>>> + set_pte(ptep, __pte(0)); >>>>>>>>>> + } >>>>>>>>>> +} >>>>>>> >>>>>>> Shouldn't that barrier be incorporated in the TLB flush code itself and >>>>>>> not every single caller (like this code is)? >>>>>>> >>>>>>> It is insane to require individual TLB flushers to be concerned with the >>>>>>> barriers. 
>>>>>> >>>>>> IMHO it is best to use existing flushing interfaces instead of creating >>>>>> new ones. >>>>> >>>>> Yeah, or make these things a _little_ harder to get wrong. That little >>>>> snippet above isn't so crazy that we should be depending on open-coded >>>>> barriers to get it right. >>>>> >>>>> Should we just add a barrier to mm_cpumask() itself? That should stop >>>>> the race. Or maybe we need a new primitive like: >>>>> >>>>> /* >>>>> * Call this if a full barrier has been executed since the last >>>>> * pagetable modification operation. >>>>> */ >>>>> static int __other_cpus_need_tlb_flush(struct mm_struct *mm) >>>>> { >>>>> /* cpumask_any_but() returns >= nr_cpu_ids if no cpus set. */ >>>>> return cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < >>>>> nr_cpu_ids; >>>>> } >>>>> >>>>> >>>>> static int other_cpus_need_tlb_flush(struct mm_struct *mm) >>>>> { >>>>> /* >>>>> * Synchronizes with switch_mm. Makes sure that we do not >>>>> * observe a bit having been cleared in mm_cpumask() before >>>>> * the other processor has seen our pagetable update. See >>>>> * switch_mm(). >>>>> */ >>>>> smp_mb__after_atomic(); >>>>> >>>>> return __other_cpus_need_tlb_flush(mm) >>>>> } >>>>> >>>>> We should be able to deploy other_cpus_need_tlb_flush() in most of the >>>>> cases where we are doing "cpumask_any_but(mm_cpumask(mm), >>>>> smp_processor_id()) < nr_cpu_ids". >>>> >>>> IMO this is a bit nuts. smp_mb__after_atomic() doesn't do anything on >>>> x86. And, even if it did, why should the flush code assume that the >>>> previous store was atomic? >>>> >>>> What's the issue being fixed / worked around here? >>> >>> It does a compiler barrier, which prevents the decision whether a >>> remote TLB shootdown is required to be made before the PTE is set. >>> >>> I agree that PTEs may not be written atomically in certain cases >>> (although I am unaware of such cases, except on full-mm flush). >> >> How about plain set_pte? 
It's atomic (aligned word-sized write), but >> it's not atomic in the _after_atomic sense. > > Can you point me to a place where set_pte is used before a TLB > invalidation/shootdown, excluding this patch and the fullmm case? > > I am not claiming there is no such case, but I am unaware of such > one. PTEs are cleared on SMP using xchg, and similarly the dirty bit > is cleared with an atomic operation. > Hmm, you may be right. I still think this is all disgusting, but I don't have any better ideas. --Andy -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-14 21:37 ` Dave Hansen 2016-06-15 2:20 ` Andy Lutomirski @ 2016-06-15 3:20 ` Nadav Amit 1 sibling, 0 replies; 38+ messages in thread From: Nadav Amit @ 2016-06-15 3:20 UTC (permalink / raw) To: Dave Hansen Cc: Lukasz Anaczkowski, LKML, linux-mm, Thomas Gleixner, Ingo Molnar, Andi Kleen, Kirill A. Shutemov, Michal Hocko, Andrew Morton, H. Peter Anvin, harish.srinivasappa, lukasz.odzioba, Andy Lutomirski Dave Hansen <dave.hansen@linux.intel.com> wrote: > On 06/14/2016 01:16 PM, Nadav Amit wrote: >> Dave Hansen <dave.hansen@linux.intel.com> wrote: >> >>> On 06/14/2016 09:47 AM, Nadav Amit wrote: >>>> Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> wrote: >>>> >>>>>> From: Andi Kleen <ak@linux.intel.com> >>>>>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) >>>>>> +{ >>>> Here there should be a call to smp_mb__after_atomic() to synchronize with >>>> switch_mm. I submitted a similar patch, which is still pending (hint). >>>> >>>>>> + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { >>>>>> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >>>>>> + flush_tlb_others(mm_cpumask(mm), mm, addr, >>>>>> + addr + PAGE_SIZE); >>>>>> + mb(); >>>>>> + set_pte(ptep, __pte(0)); >>>>>> + } >>>>>> +} >>> >>> Shouldn't that barrier be incorporated in the TLB flush code itself and >>> not every single caller (like this code is)? >>> >>> It is insane to require individual TLB flushers to be concerned with the >>> barriers. >> >> IMHO it is best to use existing flushing interfaces instead of creating >> new ones. > > Yeah, or make these things a _little_ harder to get wrong. That little > snippet above isn't so crazy that we should be depending on open-coded > barriers to get it right. > > Should we just add a barrier to mm_cpumask() itself? That should stop > the race. 
Or maybe we need a new primitive like: > > /* > * Call this if a full barrier has been executed since the last > * pagetable modification operation. > */ > static int __other_cpus_need_tlb_flush(struct mm_struct *mm) > { > /* cpumask_any_but() returns >= nr_cpu_ids if no cpus set. */ > return cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < > nr_cpu_ids; > } > > > static int other_cpus_need_tlb_flush(struct mm_struct *mm) > { > /* > * Synchronizes with switch_mm. Makes sure that we do not > * observe a bit having been cleared in mm_cpumask() before > * the other processor has seen our pagetable update. See > * switch_mm(). > */ > smp_mb__after_atomic(); > > return __other_cpus_need_tlb_flush(mm) > } > > We should be able to deploy other_cpus_need_tlb_flush() in most of the > cases where we are doing "cpumask_any_but(mm_cpumask(mm), > smp_processor_id()) < nr_cpu_ids". > > Right? This approach may work, but I doubt other_cpus_need_tlb_flush() would be needed by anyone, excluding this "hacky" workaround. There are already five interfaces for invalidation of: a single page, a userspace range, a whole task, a kernel range, and full flush including kernel (global) entries. > >> In theory, fix_pte_leak could have used flush_tlb_page. But the problem >> is that flush_tlb_page requires the vm_area_struct as an argument, which >> ptep_get_and_clear (and others) do not have. > > That, and we do not want/need to flush the _current_ processor's TLB. > flush_tlb_page() would have done that unnecessarily. That's not the end > of the world here, but it is a downside. Oops, I missed the fact a local flush is not needed in this case. Nadav ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-14 15:58 [PATCH] Linux VM workaround for Knights Landing A/D leak Lukasz Anaczkowski 2016-06-14 16:31 ` kbuild test robot 2016-06-14 16:47 ` Nadav Amit @ 2016-06-14 16:58 ` kbuild test robot 2016-06-14 17:19 ` Dave Hansen 2016-06-14 17:47 ` kbuild test robot 4 siblings, 0 replies; 38+ messages in thread From: kbuild test robot @ 2016-06-14 16:58 UTC (permalink / raw) To: Lukasz Anaczkowski Cc: kbuild-all, linux-kernel, linux-mm, tglx, mingo, dave.hansen, ak, kirill.shutemov, mhocko, akpm, hpa, lukasz.anaczkowski, harish.srinivasappa, lukasz.odzioba [-- Attachment #1: Type: text/plain, Size: 1403 bytes --] Hi, [auto build test ERROR on v4.7-rc3] [also build test ERROR on next-20160614] [cannot apply to tip/x86/core] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Lukasz-Anaczkowski/Linux-VM-workaround-for-Knights-Landing-A-D-leak/20160615-000610 config: i386-randconfig-s0-201624 (attached as .config) compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430 reproduce: # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): arch/x86/built-in.o: In function `__ptep_modify_prot_start': >> paravirt.c:(.text+0x346f4): undefined reference to `fix_pte_leak' mm/built-in.o: In function `unmap_page_range': >> (.text+0x1dd4c): undefined reference to `fix_pte_leak' mm/built-in.o: In function `move_page_tables': (.text+0x274f2): undefined reference to `fix_pte_leak' mm/built-in.o: In function `vunmap_page_range': >> vmalloc.c:(.text+0x2aa78): undefined reference to `fix_pte_leak' mm/built-in.o: In function `ptep_clear_flush': (.text+0x2cfa7): undefined reference to `fix_pte_leak' mm/built-in.o:(.text+0x30dff): more undefined references to `fix_pte_leak' follow --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel 
Corporation [-- Attachment #2: .config.gz --] [-- Type: application/octet-stream, Size: 26996 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak 2016-06-14 15:58 [PATCH] Linux VM workaround for Knights Landing A/D leak Lukasz Anaczkowski ` (2 preceding siblings ...) 2016-06-14 16:58 ` kbuild test robot @ 2016-06-14 17:19 ` Dave Hansen 2016-06-15 13:06 ` Anaczkowski, Lukasz 2016-06-14 17:47 ` kbuild test robot 4 siblings, 1 reply; 38+ messages in thread From: Dave Hansen @ 2016-06-14 17:19 UTC (permalink / raw) To: Lukasz Anaczkowski, linux-kernel, linux-mm, tglx, mingo, ak, kirill.shutemov, mhocko, akpm, hpa Cc: harish.srinivasappa, lukasz.odzioba ... > +extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr, > + pte_t *ptep); > + > static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm, > unsigned long addr, pte_t *ptep) > { > - return ptep_get_and_clear(mm, addr, ptep); > + pte_t pte = ptep_get_and_clear(mm, addr, ptep); > + > + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) > + fix_pte_leak(mm, addr, ptep); > + return pte; > } > > static inline void huge_ptep_clear_flush(struct vm_area_struct *vma, > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h > index 1a27396..9769355 100644 > --- a/arch/x86/include/asm/pgtable.h > +++ b/arch/x86/include/asm/pgtable.h > @@ -794,11 +794,16 @@ extern int ptep_test_and_clear_young(struct vm_area_struct *vma, > extern int ptep_clear_flush_young(struct vm_area_struct *vma, > unsigned long address, pte_t *ptep); > > +extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr, > + pte_t *ptep); > + > #define __HAVE_ARCH_PTEP_GET_AND_CLEAR > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, > pte_t *ptep) > { > pte_t pte = native_ptep_get_and_clear(ptep); > + if (boot_cpu_has_bug(X86_BUG_PTE_LEAK)) > + fix_pte_leak(mm, addr, ptep); > pte_update(mm, addr, ptep); > return pte; > } Doesn't hugetlb.h somehow #include pgtable.h? So why double-define fix_pte_leak()? 
> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h > index 2ee7811..6fa4079 100644 > --- a/arch/x86/include/asm/pgtable_64.h > +++ b/arch/x86/include/asm/pgtable_64.h > @@ -178,6 +178,12 @@ extern void cleanup_highmap(void); > extern void init_extra_mapping_uc(unsigned long phys, unsigned long size); > extern void init_extra_mapping_wb(unsigned long phys, unsigned long size); > > +#define ARCH_HAS_NEEDS_SWAP_PTL 1 > +static inline bool arch_needs_swap_ptl(void) > +{ > + return boot_cpu_has_bug(X86_BUG_PTE_LEAK); > +} > + > #endif /* !__ASSEMBLY__ */ I think we need a comment on this sucker. I'm not sure we should lean solely on the commit message to record why we need this until the end of time. Or, refer over to fix_pte_leak() for a full description of what is going on. > #endif /* _ASM_X86_PGTABLE_64_H */ > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c > index 6e2ffbe..f499513 100644 > --- a/arch/x86/kernel/cpu/intel.c > +++ b/arch/x86/kernel/cpu/intel.c > @@ -181,6 +181,16 @@ static void early_init_intel(struct cpuinfo_x86 *c) > } > } > > + if (c->x86_model == 87) { > + static bool printed; > + > + if (!printed) { > + pr_info("Enabling PTE leaking workaround\n"); > + printed = true; > + } > + set_cpu_bug(c, X86_BUG_PTE_LEAK); > + } Please use the macros in here for the model id: > http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/tree/arch/x86/include/asm/intel-family.h We also probably want to prefix the pr_info() with something like "x86/intel:". > +/* > + * Workaround for KNL issue: Please be specific about what this "KNL issue" *is*. Refer to the public documentation of the erratum, please. > + * A thread that is going to page fault due to P=0, may still > + * non atomically set A or D bits, which could corrupt swap entries. > + * Always flush the other CPUs and clear the PTE again to avoid > + * this leakage. We are excluded using the pagetable lock. 
> + */ > + > +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep) > +{ > + if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) { > + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); > + flush_tlb_others(mm_cpumask(mm), mm, addr, > + addr + PAGE_SIZE); > + mb(); > + set_pte(ptep, __pte(0)); > + } > +} I think the comment here is a bit sparse. Can we add some more details, like: Entering here, the current CPU just cleared the PTE. But, another thread may have raced and set the A or D bits, or be _about_ to set the bits. Shooting their TLB entry down will ensure they see the cleared PTE and will not set A or D. and by the set_pte(): Clear the PTE one more time, in case the other thread set A/D before we sent the TLB flush. > #endif /* CONFIG_SMP */ > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 5df5feb..5c80fe09 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2404,6 +2404,10 @@ static inline bool debug_guardpage_enabled(void) { return false; } > static inline bool page_is_guard(struct page *page) { return false; } > #endif /* CONFIG_DEBUG_PAGEALLOC */ > > +#ifndef ARCH_HAS_NEEDS_SWAP_PTL > +static inline bool arch_needs_swap_ptl(void) { return false; } > +#endif > + > #if MAX_NUMNODES > 1 > void __init setup_nr_node_ids(void); > #else > diff --git a/mm/memory.c b/mm/memory.c > index 15322b7..0d6ef39 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -1960,7 +1960,8 @@ static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd, > { > int same = 1; > #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT) > - if (sizeof(pte_t) > sizeof(unsigned long)) { > + if (arch_needs_swap_ptl() || > + sizeof(pte_t) > sizeof(unsigned long)) { > spinlock_t *ptl = pte_lockptr(mm, pmd); > spin_lock(ptl); > same = pte_same(*page_table, orig_pte); > ^ permalink raw reply [flat|nested] 38+ messages in thread
* RE: [PATCH] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 17:19 ` Dave Hansen
@ 2016-06-15 13:06   ` Anaczkowski, Lukasz
  0 siblings, 0 replies; 38+ messages in thread

From: Anaczkowski, Lukasz @ 2016-06-15 13:06 UTC (permalink / raw)
To: Dave Hansen, linux-kernel, linux-mm, tglx, mingo, ak,
	kirill.shutemov, mhocko, akpm, hpa
Cc: Srinivasappa, Harish; Odzioba, Lukasz

From: Dave Hansen [mailto:dave.hansen@linux.intel.com]
Sent: Tuesday, June 14, 2016 7:20 PM

>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
...
>> +extern void fix_pte_leak(struct mm_struct *mm, unsigned long addr,
>> +			 pte_t *ptep);

> Doesn't hugetlb.h somehow #include pgtable.h?  So why double-define
> fix_pte_leak()?

It's the other way round - pgtable.h somehow includes hugetlb.h.
I've removed the duplicated fix_pte_leak() declaration.

>> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
>> index 2ee7811..6fa4079 100644
>> --- a/arch/x86/include/asm/pgtable_64.h
>> +++ b/arch/x86/include/asm/pgtable_64.h
>> @@ -178,6 +178,12 @@ extern void cleanup_highmap(void);
>>  extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
>>  extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
>>
>> +#define ARCH_HAS_NEEDS_SWAP_PTL 1
>> +static inline bool arch_needs_swap_ptl(void)
>> +{
>> +	return boot_cpu_has_bug(X86_BUG_PTE_LEAK);
>> +}
>> +
>>  #endif /* !__ASSEMBLY__ */

> I think we need a comment on this sucker.  I'm not sure we should lean
> solely on the commit message to record why we need this until the end of
> time.

OK.

>> +	if (c->x86_model == 87) {

> Please use the macros in here for the model id:

OK.

> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/tree/arch/x86/include/asm/intel-family.h
> We also probably want to prefix the pr_info() with something like
> "x86/intel:".

OK.

>> +/*
>> + * Workaround for KNL issue:

> Please be specific about what this "KNL issue" *is*.

OK.

>> + * A thread that is going to page fault due to P=0, may still
>> + * non atomically set A or D bits, which could corrupt swap entries.
>> + * Always flush the other CPUs and clear the PTE again to avoid
>> + * this leakage. We are excluded using the pagetable lock.
>> + */
>> +
>> +void fix_pte_leak(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>> +{
>> +	if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) {
>> +		trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
>> +		flush_tlb_others(mm_cpumask(mm), mm, addr,
>> +				 addr + PAGE_SIZE);
>> +		mb();
>> +		set_pte(ptep, __pte(0));
>> +	}
>> +}

> I think the comment here is a bit sparse.  Can we add some more details,
> like:
>
> 	Entering here, the current CPU just cleared the PTE.  But,
> 	another thread may have raced and set the A or D bits, or be
> 	_about_ to set the bits.  Shooting their TLB entry down will
> 	ensure they see the cleared PTE and will not set A or D.
>
> and by the set_pte():
>
> 	Clear the PTE one more time, in case the other thread set A/D
> 	before we sent the TLB flush.

Thanks,
Lukasz
* Re: [PATCH] Linux VM workaround for Knights Landing A/D leak
  2016-06-14 15:58 [PATCH] Linux VM workaround for Knights Landing A/D leak Lukasz Anaczkowski
			` (3 preceding siblings ...)
  2016-06-14 17:19 ` Dave Hansen
@ 2016-06-14 17:47 ` kbuild test robot
  4 siblings, 0 replies; 38+ messages in thread

From: kbuild test robot @ 2016-06-14 17:47 UTC (permalink / raw)
To: Lukasz Anaczkowski
Cc: kbuild-all, linux-kernel, linux-mm, tglx, mingo, dave.hansen, ak,
	kirill.shutemov, mhocko, akpm, hpa, lukasz.anaczkowski,
	harish.srinivasappa, lukasz.odzioba

Hi,

[auto build test ERROR on v4.7-rc3]
[also build test ERROR on next-20160614]
[cannot apply to tip/x86/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Lukasz-Anaczkowski/Linux-VM-workaround-for-Knights-Landing-A-D-leak/20160615-000610
config: i386-randconfig-r0-201624 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   arch/x86/built-in.o: In function `__ptep_modify_prot_start':
   paravirt.c:(.text+0xc0e3c): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `zap_pte_range':
>> memory.c:(.text+0xdb8b4): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `move_page_tables':
   (.text+0x1102ff): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `vunmap_page_range':
   vmalloc.c:(.text+0x125741): undefined reference to `fix_pte_leak'
   mm/built-in.o: In function `ptep_clear_flush':
   (.text+0x12e01f): undefined reference to `fix_pte_leak'
   mm/built-in.o:madvise.c:(.text+0x131138): more undefined references to `fix_pte_leak' follow

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
end of thread, other threads:[~2016-06-16 20:24 UTC | newest]

Thread overview: 38+ messages
2016-06-14 15:58 [PATCH] Linux VM workaround for Knights Landing A/D leak Lukasz Anaczkowski
2016-06-14 16:31 ` kbuild test robot
2016-06-14 16:47 ` Nadav Amit
2016-06-14 16:54 ` Anaczkowski, Lukasz
2016-06-14 17:01 ` [PATCH v2] " Lukasz Anaczkowski
2016-06-14 17:24 ` Dave Hansen
2016-06-14 18:34 ` One Thousand Gnomes
2016-06-14 18:54 ` Dave Hansen
2016-06-14 19:19 ` Borislav Petkov
2016-06-14 20:20 ` H. Peter Anvin
2016-06-14 20:47 ` Borislav Petkov
2016-06-14 20:54 ` H. Peter Anvin
2016-06-14 21:02 ` Borislav Petkov
2016-06-14 21:08 ` H. Peter Anvin
2016-06-14 21:13 ` H. Peter Anvin
2016-06-14 18:10 ` Borislav Petkov
2016-06-15 13:12 ` Anaczkowski, Lukasz
2016-06-14 18:38 ` Nadav Amit
2016-06-15 13:12 ` Anaczkowski, Lukasz
2016-06-15 20:04 ` Nadav Amit
2016-06-15 20:10 ` Dave Hansen
2016-06-15 20:26 ` Nadav Amit
2016-06-16 15:14 ` [PATCH v3] " Lukasz Anaczkowski
2016-06-16 16:43 ` Nadav Amit
2016-06-16 20:23 ` Dave Hansen
2016-06-14 17:18 ` [PATCH] " Dave Hansen
2016-06-14 20:16 ` Nadav Amit
2016-06-14 21:37 ` Dave Hansen
2016-06-15  2:20 ` Andy Lutomirski
2016-06-15  2:35 ` Nadav Amit
2016-06-15  2:36 ` Andy Lutomirski
2016-06-15  2:44 ` Nadav Amit
2016-06-15  3:09 ` Andy Lutomirski
2016-06-15  3:20 ` Nadav Amit
2016-06-14 16:58 ` kbuild test robot
2016-06-14 17:19 ` Dave Hansen
2016-06-15 13:06 ` Anaczkowski, Lukasz
2016-06-14 17:47 ` kbuild test robot