From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932330AbcF3PYe (ORCPT ); Thu, 30 Jun 2016 11:24:34 -0400 Received: from mail-vk0-f43.google.com ([209.85.213.43]:33831 "EHLO mail-vk0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932176AbcF3PYd (ORCPT ); Thu, 30 Jun 2016 11:24:33 -0400 MIME-Version: 1.0 In-Reply-To: References: <20160617105435.GB15997@pd.tnic> <1735047.Yzv12qmPPB@vostro.rjw.lan> <1735143.jkZ4XHok9E@vostro.rjw.lan> <20160630150539.GA3962@pd.tnic> From: Andy Lutomirski Date: Thu, 30 Jun 2016 08:24:02 -0700 Message-ID: Subject: Re: [PATCH v4] x86/power/64: Fix kernel text mapping corruption during image restoration To: "Rafael J. Wysocki" Cc: Borislav Petkov , "Rafael J. Wysocki" , Logan Gunthorpe , Kees Cook , Linus Torvalds , Thomas Gleixner , Ingo Molnar , Peter Zijlstra , lkml , "Rafael J. Wysocki" , Andy Lutomirski , Brian Gerst , Denys Vlasenko , "H. Peter Anvin" , Linux PM list , Stephen Smalley Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 30, 2016 at 8:17 AM, Rafael J. Wysocki wrote: > On Thu, Jun 30, 2016 at 5:05 PM, Borislav Petkov wrote: >> On Thu, Jun 30, 2016 at 03:17:20PM +0200, Rafael J. Wysocki wrote: >>> From: Rafael J. Wysocki >>> >>> Logan Gunthorpe reports that hibernation stopped working reliably for >>> him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table >>> and rodata). >> >> ... >> >>> +static int relocate_restore_code(void) >>> +{ >>> + pgd_t *pgd; >>> + pud_t *pud; >>> + >>> + relocated_restore_code = get_safe_page(GFP_ATOMIC); >>> + if (!relocated_restore_code) >>> + return -ENOMEM; >>> + >>> + memcpy((void *)relocated_restore_code, &core_restore_code, PAGE_SIZE); >>> + >>> + /* Make the page containing the relocated code executable */ >>> + pgd = (pgd_t *)__va(read_cr3()) + pgd_index(relocated_restore_code); >>> + pud = pud_offset(pgd, relocated_restore_code); >>> + if (pud_large(*pud)) { >>> + set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX)); >>> + } else { >>> + pmd_t *pmd = pmd_offset(pud, relocated_restore_code); >>> + >>> + if (pmd_large(*pmd)) { >>> + set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX)); >>> + } else { >>> + pte_t *pte = pte_offset_kernel(pmd, relocated_restore_code); >>> + >>> + set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX)); >>> + } >>> + } >>> + flush_tlb_all(); >> >> I know you want to flush TLBs but this causes the splat below on the >> resume kernel. >> >> Most likely because: >> >> resume_target_kernel() does local_irq_disable() and then >> >> swsusp_arch_resume() -> relocate_restore_code() -> flush_tlb_all() >> >> and smp_call_function_many() doesn't like it when IRQs are disabled. > > Right. > > Can I invoke __flush_tlb_all() from here to flush the TLB on the local > CPU? That should be sufficient IMO. Yes, unless there's another CPU already up at the time, but that seems unlikely. FWIW, the CPU can and will cache the NX bit and the CPU will *not* refresh the TLB before sending a page fault. You definitely need to flush. I wonder if this is also why you're seeing a certain amount of random corruption. --Andy