From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B955FC4338F for ; Mon, 9 Aug 2021 09:28:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9B0ED60EE7 for ; Mon, 9 Aug 2021 09:28:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234502AbhHIJ25 (ORCPT ); Mon, 9 Aug 2021 05:28:57 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:21072 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234509AbhHIJ2y (ORCPT ); Mon, 9 Aug 2021 05:28:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1628501313; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZLaOMk+c1094E7Yrcw4rrXy66jCTCdmINmxO44DWzIw=; b=cJNq95gUtb/j4GrFiZh0Etr0Lm3sx+ig68CgOeO6CZAkpvVP+YdVs7iul4iD23NLDsoBl5 TcnIi4Y6kZy8YQreOJa9mcV2xOE28vPARcUB5t4g7+NxGYrhNf4XxWGZ8Gd6qkDsbdx85H lIthQanbuA+Ya+V2Dzk9gN7vcTUiAaU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-294-GzaJcT-AMvebSHOXIiVUuA-1; Mon, 09 Aug 2021 05:28:30 -0400 X-MC-Unique: GzaJcT-AMvebSHOXIiVUuA-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6D846DF8A5; Mon, 9 Aug 2021 09:28:28 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-155.bne.redhat.com [10.64.54.155]) by smtp.corp.redhat.com (Postfix) with ESMTPS id DBC5D5D6CF; Mon, 9 Aug 2021 09:28:22 +0000 (UTC) From: Gavin Shan To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, anshuman.khandual@arm.com, gerald.schaefer@linux.ibm.com, aneesh.kumar@linux.ibm.com, christophe.leroy@csgroup.eu, cai@lca.pw, catalin.marinas@arm.com, will@kernel.org, vgupta@synopsys.com, akpm@linux-foundation.org, chuhu@redhat.com, shan.gavin@gmail.com Subject: [PATCH v6 12/12] mm/debug_vm_pgtable: Fix corrupted page flag Date: Mon, 9 Aug 2021 17:26:31 +0800 Message-Id: <20210809092631.1888748-13-gshan@redhat.com> In-Reply-To: <20210809092631.1888748-1-gshan@redhat.com> References: <20210809092631.1888748-1-gshan@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org In page table entry modifying tests, set_xxx_at() are used to populate the page table entries. On ARM64, PG_arch_1 (PG_dcache_clean) flag is set to the target page flag if execution permission is given. The logic exits since commit 4f04d8f00545 ("arm64: MMU definitions"). The page flag is kept when the page is free'd to buddy's free area list. However, it will trigger page checking failure when it's pulled from the buddy's free area list, as the following warning messages indicate. BUG: Bad page state in process memhog pfn:08000 page:0000000015c0a628 refcount:0 mapcount:0 \ mapping:0000000000000000 index:0x1 pfn:0x8000 flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff) raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set This fixes the issue by clearing PG_arch_1 through flush_dcache_page() after set_xxx_at() is called. For architectures other than ARM64, the unexpected overhead of cache flushing is acceptable. Fixes: a5c3b9ffb0f4 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers") Signed-off-by: Gavin Shan Reviewed-by: Anshuman Khandual --- mm/debug_vm_pgtable.c | 55 +++++++++++++++++++++++++++++++++++++++---- 1 file changed, 51 insertions(+), 4 deletions(-) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index 4409e30adcf6..1403639302e4 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -29,6 +29,8 @@ #include #include #include + +#include #include #include @@ -119,19 +121,28 @@ static void __init pte_basic_tests(struct pgtable_debug_args *args, int idx) static void __init pte_advanced_tests(struct pgtable_debug_args *args) { + struct page *page; pte_t pte; /* * Architectures optimize set_pte_at by avoiding TLB flush. * This requires set_pte_at to be not used to update an * existing pte entry. Clear pte before we do set_pte_at + * + * flush_dcache_page() is called after set_pte_at() to clear + * PG_arch_1 for the page on ARM64. The page flag isn't cleared + * when it's released and page allocation check will fail when + * the page is allocated again. For architectures other than ARM64, + * the unexpected overhead of cache flushing is acceptable. */ - if (args->pte_pfn == ULONG_MAX) + page = (args->pte_pfn != ULONG_MAX) ? pfn_to_page(args->pte_pfn) : NULL; + if (!page) return; pr_debug("Validating PTE advanced\n"); pte = pfn_pte(args->pte_pfn, args->page_prot); set_pte_at(args->mm, args->vaddr, args->ptep, pte); + flush_dcache_page(page); ptep_set_wrprotect(args->mm, args->vaddr, args->ptep); pte = ptep_get(args->ptep); WARN_ON(pte_write(pte)); @@ -143,6 +154,7 @@ static void __init pte_advanced_tests(struct pgtable_debug_args *args) pte = pte_wrprotect(pte); pte = pte_mkclean(pte); set_pte_at(args->mm, args->vaddr, args->ptep, pte); + flush_dcache_page(page); pte = pte_mkwrite(pte); pte = pte_mkdirty(pte); ptep_set_access_flags(args->vma, args->vaddr, args->ptep, pte, 1); @@ -155,6 +167,7 @@ static void __init pte_advanced_tests(struct pgtable_debug_args *args) pte = pfn_pte(args->pte_pfn, args->page_prot); pte = pte_mkyoung(pte); set_pte_at(args->mm, args->vaddr, args->ptep, pte); + flush_dcache_page(page); ptep_test_and_clear_young(args->vma, args->vaddr, args->ptep); pte = ptep_get(args->ptep); WARN_ON(pte_young(pte)); @@ -213,15 +226,24 @@ static void __init pmd_basic_tests(struct pgtable_debug_args *args, int idx) static void __init pmd_advanced_tests(struct pgtable_debug_args *args) { + struct page *page; pmd_t pmd; unsigned long vaddr = args->vaddr; if (!has_transparent_hugepage()) return; - if (args->pmd_pfn == ULONG_MAX) + page = (args->pmd_pfn != ULONG_MAX) ? pfn_to_page(args->pmd_pfn) : NULL; + if (!page) return; + /* + * flush_dcache_page() is called after set_pmd_at() to clear + * PG_arch_1 for the page on ARM64. The page flag isn't cleared + * when it's released and page allocation check will fail when + * the page is allocated again. For architectures other than ARM64, + * the unexpected overhead of cache flushing is acceptable. + */ pr_debug("Validating PMD advanced\n"); /* Align the address wrt HPAGE_PMD_SIZE */ vaddr &= HPAGE_PMD_MASK; @@ -230,6 +252,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args) pmd = pfn_pmd(args->pmd_pfn, args->page_prot); set_pmd_at(args->mm, vaddr, args->pmdp, pmd); + flush_dcache_page(page); pmdp_set_wrprotect(args->mm, vaddr, args->pmdp); pmd = READ_ONCE(*args->pmdp); WARN_ON(pmd_write(pmd)); @@ -241,6 +264,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args) pmd = pmd_wrprotect(pmd); pmd = pmd_mkclean(pmd); set_pmd_at(args->mm, vaddr, args->pmdp, pmd); + flush_dcache_page(page); pmd = pmd_mkwrite(pmd); pmd = pmd_mkdirty(pmd); pmdp_set_access_flags(args->vma, vaddr, args->pmdp, pmd, 1); @@ -253,6 +277,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args) pmd = pmd_mkhuge(pfn_pmd(args->pmd_pfn, args->page_prot)); pmd = pmd_mkyoung(pmd); set_pmd_at(args->mm, vaddr, args->pmdp, pmd); + flush_dcache_page(page); pmdp_test_and_clear_young(args->vma, vaddr, args->pmdp); pmd = READ_ONCE(*args->pmdp); WARN_ON(pmd_young(pmd)); @@ -339,21 +364,31 @@ static void __init pud_basic_tests(struct pgtable_debug_args *args, int idx) static void __init pud_advanced_tests(struct pgtable_debug_args *args) { + struct page *page; unsigned long vaddr = args->vaddr; pud_t pud; if (!has_transparent_hugepage()) return; - if (args->pud_pfn == ULONG_MAX) + page = (args->pud_pfn != ULONG_MAX) ? pfn_to_page(args->pud_pfn) : NULL; + if (!page) return; + /* + * flush_dcache_page() is called after set_pud_at() to clear + * PG_arch_1 for the page on ARM64. The page flag isn't cleared + * when it's released and page allocation check will fail when + * the page is allocated again. For architectures other than ARM64, + * the unexpected overhead of cache flushing is acceptable. + */ pr_debug("Validating PUD advanced\n"); /* Align the address wrt HPAGE_PUD_SIZE */ vaddr &= HPAGE_PUD_MASK; pud = pfn_pud(args->pud_pfn, args->page_prot); set_pud_at(args->mm, vaddr, args->pudp, pud); + flush_dcache_page(page); pudp_set_wrprotect(args->mm, vaddr, args->pudp); pud = READ_ONCE(*args->pudp); WARN_ON(pud_write(pud)); @@ -367,6 +402,7 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args) pud = pud_wrprotect(pud); pud = pud_mkclean(pud); set_pud_at(args->mm, vaddr, args->pudp, pud); + flush_dcache_page(page); pud = pud_mkwrite(pud); pud = pud_mkdirty(pud); pudp_set_access_flags(args->vma, vaddr, args->pudp, pud, 1); @@ -382,6 +418,7 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args) pud = pfn_pud(args->pud_pfn, args->page_prot); pud = pud_mkyoung(pud); set_pud_at(args->mm, vaddr, args->pudp, pud); + flush_dcache_page(page); pudp_test_and_clear_young(args->vma, vaddr, args->pudp); pud = READ_ONCE(*args->pudp); WARN_ON(pud_young(pud)); @@ -594,16 +631,26 @@ static void __init pgd_populate_tests(struct pgtable_debug_args *args) { } static void __init pte_clear_tests(struct pgtable_debug_args *args) { + struct page *page; pte_t pte = pfn_pte(args->pte_pfn, args->page_prot); - if (args->pte_pfn == ULONG_MAX) + page = (args->pte_pfn != ULONG_MAX) ? pfn_to_page(args->pte_pfn) : NULL; + if (!page) return; + /* + * flush_dcache_page() is called after set_pte_at() to clear + * PG_arch_1 for the page on ARM64. The page flag isn't cleared + * when it's released and page allocation check will fail when + * the page is allocated again. For architectures other than ARM64, + * the unexpected overhead of cache flushing is acceptable. + */ pr_debug("Validating PTE clear\n"); #ifndef CONFIG_RISCV pte = __pte(pte_val(pte) | RANDOM_ORVALUE); #endif set_pte_at(args->mm, args->vaddr, args->ptep, pte); + flush_dcache_page(page); barrier(); pte_clear(args->mm, args->vaddr, args->ptep); pte = ptep_get(args->ptep); -- 2.23.0