From: Gary Guo <gary@garyguo.net>
To: Anup Patel
Subject: Re: [PATCH v2] RISC-V: Implement ASID allocator
Date: Thu, 28 Mar 2019 14:30:28 +0000
Message-ID: <4f290078-54bd-62e2-6356-d700e53e6315@garyguo.net>
References: <20190328063211.13052-1-anup.patel@wdc.com>
Cc: Palmer Dabbelt, Anup Patel, linux-kernel@vger.kernel.org,
 Mike Rapoport, Christoph Hellwig, Atish Patra, Albert Ou,
 Paul Walmsley, linux-riscv@lists.infradead.org

On 28/03/2019 14:09, Anup Patel wrote:
> On Thu, Mar 28, 2019 at 7:07 PM Gary Guo wrote:
>>
>> Hi Anup,
>>
>> The code still does not use ASIDs in the TLB flush routines. Without
>> this added, the code does not boot on systems with true ASID support.
>
> Can you elaborate why flush by ASID is needed and flush_tlb_all() will
> not work?
>

flush_tlb_all() will work, but not flush_tlb_mm, flush_tlb_page or
flush_tlb_range. When we want to flush something related to an MM, we
need to get its ASID and SFENCE with that ASID.

>>
>> We also need to consider the case of CONTEXTID overflow on 32-bit
>> systems. A 32-bit CONTEXTID may overflow in a month's time.
>
> On 32-bit systems, the upper 24 bits of CONTEXTID will be the VERSION
> and the lower 8 bits will be the HW ASID.
>
> Can you elaborate how you reached the conclusion that CONTEXTID will
> overflow in a month's time?
>

Assume a case where we have 256 processes to run and 8 cores. With a
context switch every 4 ms (250 Hz) on each core, 2^32 CONTEXTIDs last
2^32 / 250 / 8 ~= 2.1 million seconds, i.e. about 24 days.

>>
>> Please also see my inline comments.
>>
>> Best,
>> Gary
>>
>> On 28/03/2019 06:32, Anup Patel wrote:
>>> Currently, we do a local TLB flush on every MM switch. This is very
>>> harsh on performance because we are forcing page table walks after
>>> every MM switch.
>>>
>>> This patch implements an ASID allocator for assigning an ASID to every
>>> MM context. The number of ASIDs is limited in HW, so we create a
>>> logical entity named CONTEXTID to assign to each MM context. The lower
>>> bits of CONTEXTID are the ASID and the upper bits are a VERSION number.
>>> The number of usable ASID bits supported by HW is detected at boot-time
>>> by writing 1s to the ASID bits in the SATP CSR. This means the last
>>> ASID is always reserved because it is used for the initial MM context.
>>>
>>> We allocate a new CONTEXTID on the first MM switch of an MM context,
>>> where the ASID is allocated from an ASID bitmap and the VERSION is
>>> provided by an atomic counter. When allocating a new CONTEXTID, if we
>>> run out of available ASIDs, then:
>>> 1. We flush the ASID bitmap
>>> 2. Increment the current VERSION atomic counter
>>> 3. Re-allocate an ASID from the ASID bitmap
>>> 4. Flush the TLB on all CPUs
>>> 5. Try CONTEXTID re-assignment on all CPUs
>>>
>>> Using the above approach, we have virtually infinite CONTEXTIDs on top
>>> of a limited number of HW ASIDs. This approach is inspired by the ASID
>>> allocator used for Linux ARM/ARM64, but we have adapted it for RISC-V.
>>> Overall, this ASID allocator helps us reduce the rate of local TLB
>>> flushes on every CPU, thereby increasing performance.
>>>
>>> This patch was tested on a QEMU/virt machine and the SiFive Unleashed
>>> board. On the QEMU/virt machine, we see an approximately 10%
>>> performance improvement with the SW-emulated TLBs provided by QEMU.
>>> Unfortunately, the ASID bits of the SATP CSR are not implemented on the
>>> SiFive Unleashed board, so we don't see any change in performance.
>>>
>>> Signed-off-by: Gary Guo
>> Could you add a Co-developed-by line in addition to Signed-off-by as
>> well? Thanks.
>
> Sure, I will add.
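For reference, the CONTEXTID scheme described in the commit message can be
modelled in a few lines of plain C. This is a minimal sketch, assuming the
8-bit ASID width from the 32-bit example above; mk_contextid() and the
macro names are illustrative, not from the patch:

	#include <stdio.h>

	/*
	 * Minimal model of the CONTEXTID layout: upper bits VERSION,
	 * lower bits ASID.  8-bit ASID width assumed for illustration.
	 */
	#define ASID_BITS	8
	#define ASID_MASK	((1UL << ASID_BITS) - 1)
	#define FIRST_VERSION	(1UL << ASID_BITS)	/* version counter step */

	static unsigned long mk_contextid(unsigned long version, unsigned long asid)
	{
		return (version * FIRST_VERSION) | (asid & ASID_MASK);
	}

	int main(void)
	{
		unsigned long cntx = mk_contextid(3, 42);

		/* prints: asid=42 version=3 */
		printf("asid=%lu version=%lu\n",
		       cntx & ASID_MASK, cntx / FIRST_VERSION);

		/*
		 * The overflow estimate above: at 250 switches/s on each of
		 * 8 harts, 2^32 CONTEXTIDs last ~2.1e6 s, about 24 days.
		 */
		return 0;
	}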
>
>>> Signed-off-by: Anup Patel
>>> ---
>>> Changes since v1:
>>> - We adapt good aspects from Gary Guo's ASID allocator implementation
>>>   and provide due credit to him by adding his SoB.
>>> - Track ASIDs active during context flush and mark them as reserved
>>> - Set ASID bits to all 1s to simplify detection of the number of
>>>   ASID bits
>>> - Use atomic_long_t instead of atomic64_t to be 32-bit friendly
>>> - Use unsigned long instead of u64 to be 32-bit friendly
>>> - Use flush_tlb_all() instead of lazy local_tlb_flush_all() at the
>>>   time of context flush
>>>
>>> This patch is based on Linux-5.1-rc2 and the TLB flush cleanup patches
>>> v4 from Gary Guo. It can also be found in the riscv_asid_allocator_v2
>>> branch of https://github.com/avpatel/linux.git
>>> ---
>>>  arch/riscv/include/asm/csr.h         |   6 +
>>>  arch/riscv/include/asm/mmu.h         |   1 +
>>>  arch/riscv/include/asm/mmu_context.h |   1 +
>>>  arch/riscv/kernel/head.S             |   2 +
>>>  arch/riscv/mm/context.c              | 249 +++++++++++++++++++++++++--
>>>  5 files changed, 247 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
>>> index 28a0d1cb374c..ce18ab8f53ed 100644
>>> --- a/arch/riscv/include/asm/csr.h
>>> +++ b/arch/riscv/include/asm/csr.h
>>> @@ -45,10 +45,16 @@
>>>  #define SATP_PPN     _AC(0x003FFFFF, UL)
>>>  #define SATP_MODE_32 _AC(0x80000000, UL)
>>>  #define SATP_MODE    SATP_MODE_32
>>> +#define SATP_ASID_BITS  9
>>> +#define SATP_ASID_SHIFT 22
>>> +#define SATP_ASID_MASK  _AC(0x1FF, UL)
>>>  #else
>>>  #define SATP_PPN     _AC(0x00000FFFFFFFFFFF, UL)
>>>  #define SATP_MODE_39 _AC(0x8000000000000000, UL)
>>>  #define SATP_MODE    SATP_MODE_39
>>> +#define SATP_ASID_BITS  16
>>> +#define SATP_ASID_SHIFT 44
>>> +#define SATP_ASID_MASK  _AC(0xFFFF, UL)
>>>  #endif
>>>
>>>  /* Interrupt Enable and Interrupt Pending flags */
>>> diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
>>> index 5df2dccdba12..42a9ca0fe1fb 100644
>>> --- a/arch/riscv/include/asm/mmu.h
>>> +++ b/arch/riscv/include/asm/mmu.h
>>> @@ -18,6 +18,7 @@
>>>  #ifndef __ASSEMBLY__
>>>
>>>  typedef struct {
>>> +	atomic_long_t id;
>>>  	void *vdso;
>>>  #ifdef CONFIG_SMP
>>>  	/* A local icache flush is needed before user execution can resume. */
>>> diff --git a/arch/riscv/include/asm/mmu_context.h b/arch/riscv/include/asm/mmu_context.h
>>> index bf4f097a9051..ba6ab35c18dc 100644
>>> --- a/arch/riscv/include/asm/mmu_context.h
>>> +++ b/arch/riscv/include/asm/mmu_context.h
>>> @@ -30,6 +30,7 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
>>>  static inline int init_new_context(struct task_struct *task,
>>>  	struct mm_struct *mm)
>>>  {
>>> +	atomic_long_set(&(mm)->context.id, 0);
>> The parentheses around mm aren't necessary.
>
> Okay, I will update.
>
>>>  	return 0;
>>>  }
>>>
>>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>>> index fe884cd69abd..c3f9adc0d054 100644
>>> --- a/arch/riscv/kernel/head.S
>>> +++ b/arch/riscv/kernel/head.S
>>> @@ -95,6 +95,8 @@ relocate:
>>>  	la a2, swapper_pg_dir
>>>  	srl a2, a2, PAGE_SHIFT
>>>  	li a1, SATP_MODE
>>> +	li a0, (SATP_ASID_MASK << SATP_ASID_SHIFT)
>>> +	or a1, a1, a0
>>>  	or a2, a2, a1
>>>
>>>  	/*
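The two instructions added to head.S above pre-set every ASID bit in the
boot SATP value, so that asids_init() (later in this patch) can read the
field back and learn how many bits the hardware kept. A sketch of that
detection in plain C, modelling the WARL behaviour; hw_asid_width is an
assumed stand-in for what real hardware would implement:

	#include <stdio.h>

	/*
	 * Model of the boot-time detection in the commit message: write
	 * all 1s into the SATP ASID field, let the hardware keep only the
	 * bits it implements, then count the surviving bits.
	 */
	static unsigned detect_asid_bits(unsigned hw_asid_width)
	{
		unsigned long field_max = 0xFFFF;	/* all 1s, as head.S writes */
		unsigned long kept = field_max & ((1UL << hw_asid_width) - 1);
		unsigned bits = 0;

		while (kept) {		/* equivalent of the kernel's fls_long() */
			bits++;
			kept >>= 1;
		}
		return bits;
	}

	int main(void)
	{
		printf("%u\n", detect_asid_bits(0));	/* SiFive Unleashed: 0 */
		printf("%u\n", detect_asid_bits(16));	/* full Sv39 field: 16 */
		return 0;
	}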
>>> diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
>>> index 0f787bcd3a7a..1205d33d1b1b 100644
>>> --- a/arch/riscv/mm/context.c
>>> +++ b/arch/riscv/mm/context.c
>>> @@ -2,13 +2,209 @@
>>>  /*
>>>   * Copyright (C) 2012 Regents of the University of California
>>>   * Copyright (C) 2017 SiFive
>>> + * Copyright (C) 2019 Western Digital Corporation or its affiliates.
>>>   */
>>>
>>> +#include <linux/bitmap.h>
>>>  #include <linux/mm.h>
>>> +#include <linux/slab.h>
>>>
>>>  #include <asm/tlbflush.h>
>>>  #include <asm/cacheflush.h>
>>>
>>> +static bool use_asid_allocator;
>>> +static unsigned long asid_bits;
>>> +static unsigned long num_asids;
>>> +static unsigned long asid_mask;
>>> +static unsigned long first_version;
>>> +
>>> +static atomic_long_t current_version;
>>> +
>>> +static DEFINE_RAW_SPINLOCK(context_lock);
>>> +static unsigned long *context_asid_map;
>>> +
>>> +static DEFINE_PER_CPU(atomic_long_t, active_context);
>>> +static DEFINE_PER_CPU(unsigned long, reserved_context);
>>> +
>>> +static bool check_update_reserved_context(unsigned long cntx,
>>> +					  unsigned long newcntx)
>>> +{
>>> +	int cpu;
>>> +	bool hit = false;
>>> +
>>> +	/*
>>> +	 * Iterate over the set of reserved CONTEXTs looking for a match.
>>> +	 * If we find one, then we can update our mm to use the new CONTEXT
>>> +	 * (i.e. the same CONTEXT in the current_version), but we can't
>>> +	 * exit the loop early, since we need to ensure that all copies
>>> +	 * of the old CONTEXT are updated to reflect the mm. Failure to do
>>> +	 * so could result in us missing the reserved CONTEXT in a future
>>> +	 * version.
>>> +	 */
>>> +	for_each_possible_cpu(cpu) {
>>> +		if (per_cpu(reserved_context, cpu) == cntx) {
>>> +			hit = true;
>>> +			per_cpu(reserved_context, cpu) = newcntx;
>>> +		}
>>> +	}
>>> +
>>> +	return hit;
>>> +}
>>> +
>>> +/* Note: must be called with context_lock held */
>>> +static void __flush_context(void)
>>> +{
>>> +	int i;
>>> +	unsigned long cntx;
>>> +
>>> +	/* Update the list of reserved ASIDs and the ASID bitmap. */
>>> +	bitmap_clear(context_asid_map, 0, num_asids);
>>> +
>>> +	/* Mark already active ASIDs as used */
>>> +	for_each_possible_cpu(i) {
>>> +		cntx = atomic_long_xchg_relaxed(&per_cpu(active_context, i), 0);
>>> +		/*
>>> +		 * If this CPU has already been through a rollover, but
>>> +		 * hasn't run another task in the meantime, we must preserve
>>> +		 * its reserved CONTEXT, as this is the only trace we have of
>>> +		 * the process it is still running.
>>> +		 */
>>> +		if (cntx == 0)
>>> +			cntx = per_cpu(reserved_context, i);
>>> +
>>> +		__set_bit(cntx & asid_mask, context_asid_map);
>>> +		per_cpu(reserved_context, i) = cntx;
>>> +	}
>>> +
>>> +	/* Mark the last ASID as used because it is used at boot-time */
>>> +	__set_bit(asid_mask, context_asid_map);
>> Looks unnecessary, as we always start find_next_zero_bit from idx 1.
>
> This is to ensure that we never use the last ASID.
>

Uh, sorry, I misread. But surely we can use the last ASID after the
first rollover?

>>> +}
>>> +
>>> +/* Note: must be called with context_lock held */
>>> +static unsigned long __new_context(struct mm_struct *mm,
>>> +				   bool *need_tlb_flush)
>>> +{
>>> +	static u32 cur_idx = 1;
>>> +	unsigned long cntx = atomic_long_read(&mm->context.id);
>>> +	unsigned long asid, ver = atomic_long_read(&current_version);
>>> +
>>> +	if (cntx != 0) {
>>> +		unsigned long newcntx = ver | (cntx & ~asid_mask);
>> Shouldn't this be cntx & asid_mask?
>
> Ahh, typo. Thanks for catching.
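A quick worked example of what the typo would do, reusing the illustrative
8-bit layout from earlier (all values are made up):

	#include <stdio.h>

	int main(void)
	{
		unsigned long asid_mask = 0xff;	/* illustrative 8-bit ASID */
		unsigned long cntx = 0x12a;	/* old CONTEXTID: version 1, ASID 0x2a */
		unsigned long ver  = 0x200;	/* new version; low ASID bits are 0 */

		/* intended: keep the old ASID, stamp it with the new version */
		printf("ok  = %#lx\n", ver | (cntx & asid_mask));	/* 0x22a */
		/* the typo: drops the ASID and ORs stale version bits into ver */
		printf("bad = %#lx\n", ver | (cntx & ~asid_mask));	/* 0x300 */
		return 0;
	}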
>
>>> +
>>> +		/*
>>> +		 * If our current CONTEXT was active during a rollover, we
>>> +		 * can continue to use it and this was just a false alarm.
>>> +		 */
>>> +		if (check_update_reserved_context(cntx, newcntx))
>>> +			return newcntx;
>>> +
>>> +		/*
>>> +		 * We had a valid CONTEXT in a previous life, so try to
>>> +		 * re-use it if possible.
>>> +		 */
>>> +		if (!__test_and_set_bit(cntx & asid_mask, context_asid_map))
>>> +			return newcntx;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Allocate a free ASID. If we can't find one then increment
>>> +	 * current_version and flush all ASIDs.
>>> +	 */
>>> +	asid = find_next_zero_bit(context_asid_map, num_asids, cur_idx);
>>> +	if (asid != num_asids)
>>> +		goto set_asid;
>>> +
>>> +	/* We're out of ASIDs, so increment current_version */
>>> +	ver = atomic_long_add_return_relaxed(first_version, &current_version);
>>> +
>>> +	/* Flush everything */
>>> +	__flush_context();
>>> +	*need_tlb_flush = true;
>>> +
>>> +	/* We have more ASIDs than CPUs, so this will always succeed */
>>> +	asid = find_next_zero_bit(context_asid_map, num_asids, 1);
>>> +
>>> +set_asid:
>>> +	__set_bit(asid, context_asid_map);
>>> +	cur_idx = asid;
>>> +	return asid | ver;
>>> +}
>>> +
>>> +static void set_mm_asid(struct mm_struct *mm, unsigned int cpu)
>>> +{
>>> +	unsigned long flags;
>>> +	bool need_tlb_flush = false;
>>> +	unsigned long cntx, old_active_cntx;
>>> +
>>> +	cntx = atomic_long_read(&mm->context.id);
>>> +
>>> +	/*
>>> +	 * If our active_context is non-zero and the context matches the
>>> +	 * current_version, then we update the active_context entry with a
>>> +	 * relaxed cmpxchg.
>>> +	 *
>>> +	 * Following is how we handle racing with a concurrent rollover:
>>> +	 *
>>> +	 * - We get a zero back from the cmpxchg and end up waiting on the
>>> +	 *   lock. Taking the lock synchronises with the rollover and so
>>> +	 *   we are forced to see the updated version.
>>> +	 *
>>> +	 * - We get a valid context back from the cmpxchg; then we continue
>>> +	 *   using the old ASID, because __flush_context() would have marked
>>> +	 *   the ASID of active_context as used, and at the next context
>>> +	 *   switch we will allocate a new context.
>>> +	 */
>>> +	old_active_cntx = atomic_long_read(&per_cpu(active_context, cpu));
>>> +	if (old_active_cntx &&
>>> +	    !((cntx ^ atomic_long_read(&current_version)) >> asid_bits) &&
>> This looks hard to read. Probably
>>   (cntx & ~asid_mask) == atomic_long_read(&current_version)
>> is clearer.
>
> No issues, I am fine with either way. I will update.
>
>>> +	    atomic_long_cmpxchg_relaxed(&per_cpu(active_context, cpu),
>>> +					old_active_cntx, cntx))
>>> +		goto switch_mm_fast;
>>> +
>>> +	raw_spin_lock_irqsave(&context_lock, flags);
>> Any reason that we prefer raw_ here?
>>> +
>>> +	/* Check that our ASID belongs to the current_version. */
>>> +	cntx = atomic_long_read(&mm->context.id);
>>> +	if ((cntx ^ atomic_long_read(&current_version)) >> asid_bits) {
>> Same as above; probably
>>   (cntx & ~asid_mask) != atomic_long_read(&current_version)
>> makes more sense.
>>> +		cntx = __new_context(mm, &need_tlb_flush);
>>> +		atomic_long_set(&mm->context.id, cntx);
>>> +	}
>>> +
>>> +	atomic_long_set(&per_cpu(active_context, cpu), cntx);
>>> +
>>> +	raw_spin_unlock_irqrestore(&context_lock, flags);
>>> +
>>> +switch_mm_fast:
>>> +	/*
>>> +	 * Use the old sptbr name instead of the current satp name to
>>> +	 * support binutils 2.29, which doesn't know about the privileged
>>> +	 * ISA 1.10 yet.
>>> +	 */
>>> +	csr_write(sptbr, virt_to_pfn(mm->pgd) |
>>> +		  ((cntx & asid_mask) << SATP_ASID_SHIFT) |
>>> +		  SATP_MODE);
>>> +
>>> +	if (need_tlb_flush)
>>> +		flush_tlb_all();
>> I think your intention here is to avoid calling flush_tlb_all when IRQs
>> are off in the critical region. However, switch_mm will be called from
>> the scheduler as well, which also turns IRQs off, so this still causes
>> an issue. I think a better way is to force flush_tlb_all to use SBI
>> when IRQs are off; see the sketch below. What do you think?
>
> We are still waiting for OpenSBI to provide a complete implementation.
>
> I agree that we should prefer an SBI-based remote TLB flush-all here.
> Let's wait for more comments.

I prefer SBI as well. Can you reopen OpenSBI issue #87 to track the
progress until we can properly handle the race conditions in OpenSBI?
Once that's completed, I'll drop the IPI patch and we can safely do
flush_tlb_all within __flush_context.
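For reference, a minimal sketch of the IRQs-off fallback suggested above.
flush_tlb_all_safe() is a hypothetical name and the policy itself is the
proposal, not existing code; sbi_remote_sfence_vma() and irqs_disabled()
are the existing kernel primitives:

	/*
	 * When IRQs are off we cannot wait for IPI-based shootdown
	 * acknowledgements, so fall back to the SBI remote fence and let
	 * M-mode do the cross-hart flush.
	 */
	static void flush_tlb_all_safe(void)
	{
		if (irqs_disabled())
			sbi_remote_sfence_vma(NULL, 0, -1);	/* all harts, whole VA space */
		else
			flush_tlb_all();			/* normal IPI-based path */
	}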
>>> +}
>>> +
>>> +static void set_mm_noasid(struct mm_struct *mm)
>>> +{
>>> +	/*
>>> +	 * Use the old sptbr name instead of the current satp name to
>>> +	 * support binutils 2.29, which doesn't know about the privileged
>>> +	 * ISA 1.10 yet.
>>> +	 */
>>> +	csr_write(sptbr, virt_to_pfn(mm->pgd) | SATP_MODE);
>>> +
>>> +	/*
>>> +	 * sfence.vma after the SATP write. We call it on the MM context
>>> +	 * instead of calling local_flush_tlb_all to prevent global
>>> +	 * mappings from being affected.
>>> +	 */
>>> +	local_flush_tlb_mm(mm);
>>> +}
>>> +
>>>  /*
>>>   * When necessary, performs a deferred icache flush for the given MM context,
>>>   * on the local CPU. RISC-V has no direct mechanism for instruction cache
>>> @@ -58,20 +254,49 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>>>  	cpumask_clear_cpu(cpu, mm_cpumask(prev));
>>>  	cpumask_set_cpu(cpu, mm_cpumask(next));
>>>
>>> -	/*
>>> -	 * Use the old spbtr name instead of using the current satp
>>> -	 * name to support binutils 2.29 which doesn't know about the
>>> -	 * privileged ISA 1.10 yet.
>>> -	 */
>>> -	csr_write(sptbr, virt_to_pfn(next->pgd) | SATP_MODE);
>>> +	if (use_asid_allocator)
>>> +		set_mm_asid(next, cpu);
>>> +	else
>>> +		set_mm_noasid(next);
>>> +
>>> +	flush_icache_deferred(next);
>>> +}
>>> +
>>> +static int asids_init(void)
>>> +{
>>> +	/* Figure out the number of ASID bits in HW */
>>> +	asid_bits = (csr_read(sptbr) >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
>>> +	asid_bits = fls_long(asid_bits);
>>> +
>>> +	/* Pre-compute ASID details */
>>> +	num_asids = 1 << asid_bits;
>>> +	asid_mask = num_asids - 1;
>>> +	first_version = num_asids;
>> Is there any reason we want to have two set-once variables with the
>> same value?
>
> Yep, "first_version" looks redundant. I will update.
>
>>>
>>>  	/*
>>> -	 * sfence.vma after SATP write. We call it on MM context instead of
>>> -	 * calling local_flush_tlb_all to prevent global mappings from being
>>> -	 * affected.
>>> +	 * Use the ASID allocator only if the number of HW ASIDs is at
>>> +	 * least twice the number of CPUs
>>>  	 */
>>> -	local_flush_tlb_mm(next);
>>> +	use_asid_allocator =
>>> +		(num_asids <= (2 * num_possible_cpus())) ? false : true;
>>>
>>> -	flush_icache_deferred(next);
>>> -}
>>> +	/* Setup the ASID allocator if available */
>>> +	if (use_asid_allocator) {
>>> +		atomic_long_set(&current_version, first_version);
>>>
>>> +		context_asid_map = kcalloc(BITS_TO_LONGS(num_asids),
>>> +			sizeof(*context_asid_map), GFP_KERNEL);
>>> +		if (!context_asid_map)
>>> +			panic("Failed to allocate bitmap for %lu ASIDs\n",
>>> +			      num_asids);
>>> +
>>> +		__set_bit(asid_mask, context_asid_map);
>>> +
>>> +		pr_info("ASID allocator using %lu entries\n", num_asids);
>>> +	} else {
>> If we decide not to use the ASID allocator, we will need to set the
>> ASID field to zero on *all harts* before we do our first switch_mm;
>> see the sketch below. Otherwise we will end up with one hart running a
>> non-zero ASID and another running ASID zero with a different page
>> table.
>
> Yes, I saw that in your implementation, but for better readability and
> debuggability I have preserved the asid_bits that we computed and added
> a separate use_asid_allocator flag.

I didn't say I'm against having use_asid_allocator.
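A sketch of the boot-side fix requested above. The helper name and call
site are assumptions, not part of this patch; csr_read/csr_write,
SATP_ASID_*, and local_flush_tlb_all are the primitives the patch
already uses:

	/*
	 * Hypothetical: before the first switch_mm, each hart drops the
	 * boot-time all-1s ASID back to zero, so that every hart runs
	 * ASID 0 when the allocator is disabled.
	 */
	static void __init clear_boot_asid(void)
	{
		unsigned long satp = csr_read(sptbr);

		satp &= ~(SATP_ASID_MASK << SATP_ASID_SHIFT);	/* keep MODE + PPN */
		csr_write(sptbr, satp);
		local_flush_tlb_all();	/* drop translations tagged with the boot ASID */
	}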
>
> In future, I plan to show asid_bits in /proc/cpuinfo as well.
>
>>> +		pr_info("ASID allocator disabled\n");
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +early_initcall(asids_init);
>>>
>
> Regards,
> Anup

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv