From: Mark Rutland
To: Pingfan Liu
Subject: Re: [PATCH] arm64/mm: save memory access in check_and_switch_context() fast switch path
Date: Fri, 3 Jul 2020 11:13:36 +0100
Message-ID: <20200703101336.GA31383@C02TD0UTHF1T.local>
In-Reply-To: <1593755079-2160-1-git-send-email-kernelfans@gmail.com>
References: <1593755079-2160-1-git-send-email-kernelfans@gmail.com>
Cc: Jean-Philippe Brucker, Vladimir Murzin, Steve Capper, Catalin Marinas,
 Will Deacon, linux-arm-kernel@lists.infradead.org

On Fri, Jul 03, 2020 at 01:44:39PM +0800, Pingfan Liu wrote:
> The cpu_number and __per_cpu_offset cost two different cache lines, and may
> not exist after a heavy user space load.
>
> By replacing per_cpu(active_asids, cpu) with this_cpu_ptr(&active_asids) in
> the fast path, a register is used and these memory accesses are avoided.

How about:

| On arm64, smp_processor_id() reads a per-cpu `cpu_number` variable,
| using the per-cpu offset stored in the tpidr_el1 system register. In
| some cases we generate a per-cpu address with a sequence like:
|
|	cpu_ptr = &per_cpu(ptr, smp_processor_id());
|
| ... which potentially incurs a cache miss for both `cpu_number` and the
| in-memory `__per_cpu_offset` array. This can be written more optimally
| as:
|
|	cpu_ptr = this_cpu_ptr(ptr);
|
| ... which only needs the offset from tpidr_el1, and does not need to
| load from memory.
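To make the two patterns concrete, here's a minimal sketch (an editorial
illustration, not part of the suggested wording; the per-cpu variable `foo`
is hypothetical and not from the patch):

	/* hypothetical per-cpu variable, for illustration only */
	DEFINE_PER_CPU(atomic64_t, foo);

	/* loads cpu_number, then indexes the in-memory __per_cpu_offset array */
	atomic64_t *p1 = &per_cpu(foo, smp_processor_id());

	/* derives the same address from tpidr_el1 alone, with no extra loads */
	atomic64_t *p2 = this_cpu_ptr(&foo);

Both forms yield the same pointer on the same CPU; the second simply avoids
the two potential cache misses described above.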
> By replacing per_cpu(active_asids, cpu) with this_cpu_ptr(&active_asids) in
> the fast path, a register is used and these memory accesses are avoided.

Do you have any numbers that show a benefit here? It's not clear to me how
often the above case would apply where the caches would also be hot for
everything else we need, and numbers would help to justify that.

> Signed-off-by: Pingfan Liu
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: Steve Capper
> Cc: Mark Rutland
> Cc: Vladimir Murzin
> Cc: Jean-Philippe Brucker
> To: linux-arm-kernel@lists.infradead.org
> ---
>  arch/arm64/include/asm/mmu_context.h |  6 ++----
>  arch/arm64/mm/context.c              | 10 ++++++----
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index ab46187..808c3be 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -175,7 +175,7 @@ static inline void cpu_replace_ttbr1(pgd_t *pgdp)
>   * take CPU migration into account.
>   */
>  #define destroy_context(mm)		do { } while(0)
> -void check_and_switch_context(struct mm_struct *mm, unsigned int cpu);
> +void check_and_switch_context(struct mm_struct *mm);
>
>  #define init_new_context(tsk,mm)	({ atomic64_set(&(mm)->context.id, 0); 0; })
>
> @@ -214,8 +214,6 @@ enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
>
>  static inline void __switch_mm(struct mm_struct *next)
>  {
> -	unsigned int cpu = smp_processor_id();
> -
>  	/*
>  	 * init_mm.pgd does not contain any user mappings and it is always
>  	 * active for kernel addresses in TTBR1. Just set the reserved TTBR0.
> @@ -225,7 +223,7 @@ static inline void __switch_mm(struct mm_struct *next)
>  		return;
>  	}
>
> -	check_and_switch_context(next, cpu);
> +	check_and_switch_context(next);
>  }
>
>  static inline void
> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> index d702d60..a206655 100644
> --- a/arch/arm64/mm/context.c
> +++ b/arch/arm64/mm/context.c
> @@ -198,9 +198,10 @@ static u64 new_context(struct mm_struct *mm)
>  	return idx2asid(asid) | generation;
>  }
>
> -void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
> +void check_and_switch_context(struct mm_struct *mm)
>  {
>  	unsigned long flags;
> +	unsigned int cpu;
>  	u64 asid, old_active_asid;
>
>  	if (system_supports_cnp())
> @@ -222,9 +223,9 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
>  	 * relaxed xchg in flush_context will treat us as reserved
>  	 * because atomic RmWs are totally ordered for a given location.
>  	 */
> -	old_active_asid = atomic64_read(&per_cpu(active_asids, cpu));
> +	old_active_asid = atomic64_read(this_cpu_ptr(&active_asids));
>  	if (old_active_asid && asid_gen_match(asid) &&
> -	    atomic64_cmpxchg_relaxed(&per_cpu(active_asids, cpu),
> +	    atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_asids),
>  				     old_active_asid, asid))
>  		goto switch_mm_fastpath;
>
> @@ -236,10 +237,11 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
>  		atomic64_set(&mm->context.id, asid);
>  	}
>
> +	cpu = smp_processor_id();
>  	if (cpumask_test_and_clear_cpu(cpu, &tlb_flush_pending))
>  		local_flush_tlb_all();
>
> -	atomic64_set(&per_cpu(active_asids, cpu), asid);
> +	atomic64_set(this_cpu_ptr(&active_asids), asid);
>  	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);

FWIW, this looks sound to me.

Mark.
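P.S. One caveat worth spelling out (an illustrative note, not part of the
patch): this_cpu_ptr() only returns a stable pointer while preemption is
disabled, which holds here because check_and_switch_context() is reached
from the context-switch path with preemption already off. Preemptible code
adopting the same pattern would need something along these lines:

	/*
	 * Sketch only: keep the task on the current CPU for the duration
	 * of the per-cpu access. The context-switch path does not need
	 * this because the scheduler has already disabled preemption.
	 */
	preempt_disable();
	atomic64_set(this_cpu_ptr(&active_asids), asid);
	preempt_enable();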