From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiuyang Liu
Date: Tue, 16 Mar 2021 06:56:58 +0000
Subject: Re: [PATCH] Insert SFENCE.VMA in function set_pte_at for RISCV
To: Anup Patel
Cc: Alexandre Ghiti, Andrew Waterman, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Atish Patra, Anup Patel, Andrew Morton, Mike Rapoport,
 Kefeng Wang, Zong Li, Greentime Hu, linux-riscv,
 "linux-kernel@vger.kernel.org List"
References: <20210316015328.13516-1-liu@jiuyang.me> <20210316034638.16276-1-liu@jiuyang.me>
Content-Type: text/plain; charset="UTF-8"
X-Mailing-List: linux-kernel@vger.kernel.org

> As per my understanding, we don't need to explicitly invalidate local TLB
> in set_pte() or set_pte_at() because generic Linux page table management
> (/mm/*) will call the appropriate flush_tlb_xyz() function after page
> table updates.
I witnessed this bug on our micro-architecture: the store issued by set_pte
was still in the store buffer, and none of the functions in the call stack
below insert an SFENCE.VMA, so the TLB never observes the modification.
Here is my call stack:

set_pte
set_pte_at
map_vm_area
__vmalloc_area_node
__vmalloc_node_range
__vmalloc_node
__vmalloc_node_flags
vzalloc
n_tty_open

I think this is architecture-specific code, so /mm/* should not be
modified. And the spec requires an SFENCE.VMA after each modification to
the page tables, so I added the code here.

> Also, just local TLB flush is generally not sufficient because
> a lot of page tables will be used across multiple HARTs.

Yes, this is the biggest issue. RISC-V Volume 2, Privileged Spec
v. 20190608, page 67 gives a solution:

    Consequently, other harts must be notified separately when the
    memory-management data structures have been modified. One approach is
    to use 1) a local data fence to ensure local writes are visible
    globally, then 2) an interprocessor interrupt to the other thread,
    then 3) a local SFENCE.VMA in the interrupt handler of the remote
    thread, and finally 4) signal back to the originating thread that the
    operation is complete. This is, of course, the RISC-V analog to a TLB
    shootdown.

In general, this patch doesn't handle the G bit in the PTE; for global
mappings the kernel would need to trap into sbi_remote_sfence_vma. Do you
think I should use flush_tlb_all?
Jiuyang

arch/arm/mm/mmu.c:

void set_pte_at(struct mm_struct *mm, unsigned long addr,
		pte_t *ptep, pte_t pteval)
{
	unsigned long ext = 0;

	if (addr < TASK_SIZE && pte_valid_user(pteval)) {
		if (!pte_special(pteval))
			__sync_icache_dcache(pteval);
		ext |= PTE_EXT_NG;
	}

	set_pte_ext(ptep, pteval, ext);
}

arch/mips/include/asm/pgtable.h:

static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
			      pte_t *ptep, pte_t pteval)
{
	if (!pte_present(pteval))
		goto cache_sync_done;

	if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
		goto cache_sync_done;

	__update_cache(addr, pteval);
cache_sync_done:
	set_pte(ptep, pteval);
}

On Tue, Mar 16, 2021 at 5:05 AM Anup Patel wrote:
>
> +Alex
>
> On Tue, Mar 16, 2021 at 9:20 AM Jiuyang Liu wrote:
> >
> > This patch inserts SFENCE.VMA after modifying PTE based on the RISC-V
> > specification.
> >
> > arch/riscv/include/asm/pgtable.h:
> > 1. implement pte_user, pte_global and pte_leaf to check the
> > corresponding attributes of a pte_t.
>
> Adding pte_user(), pte_global(), and pte_leaf() is fine.
>
> >
> > 2. insert SFENCE.VMA in set_pte_at based on RISC-V Volume 2, Privileged
> > Spec v. 20190608 pages 66 and 67:
> > If software modifies a non-leaf PTE, it should execute SFENCE.VMA with
> > rs1=x0. If any PTE along the traversal path had its G bit set, rs2 must
> > be x0; otherwise, rs2 should be set to the ASID for which the
> > translation is being modified.
> > If software modifies a leaf PTE, it should execute SFENCE.VMA with rs1
> > set to a virtual address within the page. If any PTE along the traversal
> > path had its G bit set, rs2 must be x0; otherwise, rs2 should be set to
> > the ASID for which the translation is being modified.
> >
> > arch/riscv/include/asm/tlbflush.h:
> > 1. implement get_current_asid to get the current program's ASID.
> > 2. implement local_flush_tlb_asid to flush the TLB by ASID.
>
> As per my understanding, we don't need to explicitly invalidate local TLB
> in set_pte() or set_pte_at() because generic Linux page table management
> (/mm/*) will call the appropriate flush_tlb_xyz() function after page
> table updates. Also, just local TLB flush is generally not sufficient
> because a lot of page tables will be used across multiple HARTs.
>
> >
> > Signed-off-by: Jiuyang Liu
> > ---
> >  arch/riscv/include/asm/pgtable.h  | 27 +++++++++++++++++++++++++++
> >  arch/riscv/include/asm/tlbflush.h | 12 ++++++++++++
> >  2 files changed, 39 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > index ebf817c1bdf4..5a47c60372c1 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -222,6 +222,16 @@ static inline int pte_write(pte_t pte)
> >  	return pte_val(pte) & _PAGE_WRITE;
> >  }
> >
> > +static inline int pte_user(pte_t pte)
> > +{
> > +	return pte_val(pte) & _PAGE_USER;
> > +}
> > +
> > +static inline int pte_global(pte_t pte)
> > +{
> > +	return pte_val(pte) & _PAGE_GLOBAL;
> > +}
> > +
> >  static inline int pte_exec(pte_t pte)
> >  {
> >  	return pte_val(pte) & _PAGE_EXEC;
> > @@ -248,6 +258,11 @@ static inline int pte_special(pte_t pte)
> >  	return pte_val(pte) & _PAGE_SPECIAL;
> >  }
> >
> > +static inline int pte_leaf(pte_t pte)
> > +{
> > +	return pte_val(pte) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC);
> > +}
> > +
> >  /* static inline pte_t pte_rdprotect(pte_t pte) */
> >
> >  static inline pte_t pte_wrprotect(pte_t pte)
> > @@ -358,6 +373,18 @@ static inline void set_pte_at(struct mm_struct *mm,
> >  		flush_icache_pte(pteval);
> >
> >  	set_pte(ptep, pteval);
> > +
> > +	if (pte_present(pteval)) {
> > +		if (pte_leaf(pteval)) {
> > +			local_flush_tlb_page(addr);
> > +		} else {
> > +			if (pte_global(pteval))
> > +				local_flush_tlb_all();
> > +			else
> > +				local_flush_tlb_asid();
> > +
> > +		}
> > +	}
> >  }
> >
> >  static inline void pte_clear(struct mm_struct *mm,
> > diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
> > index 394cfbccdcd9..1f9b62b3670b 100644
> > --- a/arch/riscv/include/asm/tlbflush.h
> > +++ b/arch/riscv/include/asm/tlbflush.h
> > @@ -21,6 +21,18 @@ static inline void local_flush_tlb_page(unsigned long addr)
> >  {
> >  	__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> >  }
> > +
> > +static inline unsigned long get_current_asid(void)
> > +{
> > +	return (csr_read(CSR_SATP) >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
> > +}
> > +
> > +static inline void local_flush_tlb_asid(void)
> > +{
> > +	unsigned long asid = get_current_asid();
> > +	__asm__ __volatile__ ("sfence.vma x0, %0" : : "r" (asid) : "memory");
> > +}
> > +
> >  #else /* CONFIG_MMU */
> >  #define local_flush_tlb_all()			do { } while (0)
> >  #define local_flush_tlb_page(addr)		do { } while (0)
> > --
> > 2.30.2
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
>
> Regards,
> Anup