From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751699AbcF3RXg (ORCPT ); Thu, 30 Jun 2016 13:23:36 -0400 Received: from mail-vk0-f54.google.com ([209.85.213.54]:36315 "EHLO mail-vk0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750825AbcF3RXf (ORCPT ); Thu, 30 Jun 2016 13:23:35 -0400 MIME-Version: 1.0 In-Reply-To: References: <20160617105435.GB15997@pd.tnic> <1735047.Yzv12qmPPB@vostro.rjw.lan> <1735143.jkZ4XHok9E@vostro.rjw.lan> <20160630150539.GA3962@pd.tnic> From: Andy Lutomirski Date: Thu, 30 Jun 2016 10:23:14 -0700 Message-ID: Subject: Re: [PATCH v4] x86/power/64: Fix kernel text mapping corruption during image restoration To: "Rafael J. Wysocki" Cc: Borislav Petkov , "Rafael J. Wysocki" , Logan Gunthorpe , Kees Cook , Linus Torvalds , Thomas Gleixner , Ingo Molnar , Peter Zijlstra , lkml , "Rafael J. Wysocki" , Andy Lutomirski , Brian Gerst , Denys Vlasenko , "H. Peter Anvin" , Linux PM list , Stephen Smalley Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 30, 2016 at 8:29 AM, Rafael J. Wysocki wrote: > On Thu, Jun 30, 2016 at 5:24 PM, Andy Lutomirski wrote: >> On Thu, Jun 30, 2016 at 8:17 AM, Rafael J. Wysocki wrote: >>> On Thu, Jun 30, 2016 at 5:05 PM, Borislav Petkov wrote: >>>> On Thu, Jun 30, 2016 at 03:17:20PM +0200, Rafael J. Wysocki wrote: >>>>> From: Rafael J. Wysocki >>>>> >>>>> Logan Gunthorpe reports that hibernation stopped working reliably for >>>>> him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table >>>>> and rodata). >>>> >>>> ... >>>> >>>>> +static int relocate_restore_code(void) >>>>> +{ >>>>> + pgd_t *pgd; >>>>> + pud_t *pud; >>>>> + >>>>> + relocated_restore_code = get_safe_page(GFP_ATOMIC); >>>>> + if (!relocated_restore_code) >>>>> + return -ENOMEM; >>>>> + >>>>> + memcpy((void *)relocated_restore_code, &core_restore_code, PAGE_SIZE); >>>>> + >>>>> + /* Make the page containing the relocated code executable */ >>>>> + pgd = (pgd_t *)__va(read_cr3()) + pgd_index(relocated_restore_code); >>>>> + pud = pud_offset(pgd, relocated_restore_code); >>>>> + if (pud_large(*pud)) { >>>>> + set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX)); >>>>> + } else { >>>>> + pmd_t *pmd = pmd_offset(pud, relocated_restore_code); >>>>> + >>>>> + if (pmd_large(*pmd)) { >>>>> + set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX)); >>>>> + } else { >>>>> + pte_t *pte = pte_offset_kernel(pmd, relocated_restore_code); >>>>> + >>>>> + set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX)); >>>>> + } >>>>> + } >>>>> + flush_tlb_all(); >>>> >>>> I know you want to flush TLBs but this causes the splat below on the >>>> resume kernel. >>>> >>>> Most likely because: >>>> >>>> resume_target_kernel() does local_irq_disable() and then >>>> >>>> swsusp_arch_resume() -> relocate_restore_code() -> flush_tlb_all() >>>> >>>> and smp_call_function_many() doesn't like it when IRQs are disabled. >>> >>> Right. >>> >>> Can I invoke __flush_tlb_all() from here to flush the TLB on the local >>> CPU? That should be sufficient IMO. >> >> Yes, unless there's another CPU already up at the time, but that seems unlikely. > > They are all offline at that time. > > OK, let me do that then. > >> FWIW, the CPU can and will cache the NX bit and the CPU will *not* >> refresh the TLB before sending a page fault. You definitely need to >> flush. > > OK > >> I wonder if this is also why you're seeing a certain amount of >> random corruption. > > Well, the corruption we saw was not really random. I can give you all > of the details if you're interested, maybe you'll be able to figure > out what might be happening. :-) Sure, or at least in abbreviated form. No guarantee that I'll have any clue, though. --Andy