From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752325AbaJQOLb (ORCPT ); Fri, 17 Oct 2014 10:11:31 -0400
Received: from e06smtp11.uk.ibm.com ([195.75.94.107]:57956 "EHLO
	e06smtp11.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752442AbaJQOKN (ORCPT );
	Fri, 17 Oct 2014 10:10:13 -0400
From: Dominik Dingel
To: Andrew Morton , linux-mm@kvack.org, Mel Gorman , Michal Hocko ,
	Rik van Riel
Cc: Andrea Arcangeli , Andy Lutomirski , "Aneesh Kumar K.V" , Bob Liu ,
	Christian Borntraeger , Cornelia Huck , Gleb Natapov ,
	Heiko Carstens , "H. Peter Anvin" , Hugh Dickins , Ingo Molnar ,
	Jianyu Zhan , Johannes Weiner , "Kirill A. Shutemov" ,
	Konstantin Weitz , kvm@vger.kernel.org, linux390@de.ibm.com,
	linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
	Martin Schwidefsky , Paolo Bonzini , Peter Zijlstra , Sasha Levin ,
	Dominik Dingel
Subject: [PATCH 3/4] s390/mm: prevent and break zero page mappings in case of storage keys
Date: Fri, 17 Oct 2014 16:09:49 +0200
Message-Id: <1413554990-48512-4-git-send-email-dingel@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.8.5.5
In-Reply-To: <1413554990-48512-1-git-send-email-dingel@linux.vnet.ibm.com>
References: <1413554990-48512-1-git-send-email-dingel@linux.vnet.ibm.com>
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 14101714-0005-0000-0000-000001B395B2
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

As soon as storage keys are enabled we need to work around zero page
mappings to prevent inconsistencies between storage keys and pgste.
Otherwise the following data corruption could happen:
1) guest enables storage key
2) guest sets storage key for not mapped page X
   -> change goes to PGSTE
3) guest reads from page X
   -> as X was not dirty before, the page will be zero page backed,
      storage key from PGSTE for X will go to storage key for zero page
4) guest sets storage key for not mapped page Y (same logic as above)
5) guest reads from page Y
   -> as Y was not dirty before, the page will be zero page backed,
      storage key from PGSTE for Y will go to storage key for zero page,
      overwriting storage key for X

While holding the mmap sem, we are safe against changes to entries we
already fixed. As sske and host large pages are also mutually exclusive
we do not even need to retry the fixup_user_fault.

Signed-off-by: Dominik Dingel
Acked-by: Christian Borntraeger
Signed-off-by: Martin Schwidefsky
---
 arch/s390/Kconfig      |  3 +++
 arch/s390/mm/pgtable.c | 15 +++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 05c78bb..4e04e63 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -1,6 +1,9 @@
 config MMU
 	def_bool y
 
+config NOZEROPAGE
+	def_bool y
+
 config ZONE_DMA
 	def_bool y
 
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index ab55ba8..6321692 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -1309,6 +1309,15 @@ static int __s390_enable_skey(pte_t *pte, unsigned long addr,
 	pgste_t pgste;
 
 	pgste = pgste_get_lock(pte);
+	/*
+	 * Remove all zero page mappings,
+	 * after establishing a policy to forbid zero page mappings
+	 * following faults for that page will get fresh anonymous pages
+	 */
+	if (is_zero_pfn(pte_pfn(*pte))) {
+		ptep_flush_direct(walk->mm, addr, pte);
+		pte_val(*pte) = _PAGE_INVALID;
+	}
 	/* Clear storage key */
 	pgste_val(pgste) &= ~(PGSTE_ACC_BITS | PGSTE_FP_BIT |
 			      PGSTE_GR_BIT | PGSTE_GC_BIT);
@@ -1323,10 +1332,16 @@ void s390_enable_skey(void)
 {
 	struct mm_walk walk = { .pte_entry = __s390_enable_skey };
 	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
 
 	down_write(&mm->mmap_sem);
 	if (mm_use_skey(mm))
 		goto out_up;
+
+	for (vma = mm->mmap; vma; vma = vma->vm_next)
+		vma->vm_flags |= VM_NOZEROPAGE;
+	mm->def_flags |= VM_NOZEROPAGE;
+
 	walk.mm = mm;
 	walk_page_range(0, TASK_SIZE, &walk);
 	mm->context.use_skey = 1;
-- 
1.8.5.5