Subject: Re: [PATCH 2/2] arm64/mm: Enable color zero pages
From: Gavin Shan
To: Will Deacon
Cc: mark.rutland@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, linux-kernel@vger.kernel.org, shan.gavin@gmail.com, linux-arm-kernel@lists.infradead.org
Date: Thu, 17 Sep 2020 13:35:06 +1000
References: <20200916032523.13011-1-gshan@redhat.com> <20200916032523.13011-3-gshan@redhat.com> <20200916082819.GB27496@willie-the-truck>
In-Reply-To: <20200916082819.GB27496@willie-the-truck>

Hi Will,

On 9/16/20 6:28 PM, Will Deacon wrote:
> On Wed, Sep 16, 2020 at 01:25:23PM +1000, Gavin Shan wrote:
>> This enables color zero pages by allocating contigous page frames
>> for it. The number of pages for this is determined by L1 dCache
>> (or iCache) size, which is probbed from the hardware.
>>
>> * Add cache_total_size() to return L1 dCache (or iCache) size
>>
>> * Implement setup_zero_pages(), which is called after the page
>> allocator begins to work, to allocate the contigous pages
>> needed by color zero page.
>>
>> * Reworked ZERO_PAGE() and define __HAVE_COLOR_ZERO_PAGE.
>>
>> Signed-off-by: Gavin Shan
>> ---
>>  arch/arm64/include/asm/cache.h   | 22 ++++++++++++++++++++
>>  arch/arm64/include/asm/pgtable.h |  9 ++++++--
>>  arch/arm64/kernel/cacheinfo.c    | 34 +++++++++++++++++++++++++++++++
>>  arch/arm64/mm/init.c             | 35 ++++++++++++++++++++++++++++++++
>>  arch/arm64/mm/mmu.c              |  7 -------
>>  5 files changed, 98 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
>> index a4d1b5f771f6..420e9dde2c51 100644
>> --- a/arch/arm64/include/asm/cache.h
>> +++ b/arch/arm64/include/asm/cache.h
>> @@ -39,6 +39,27 @@
>>  #define CLIDR_LOC(clidr) (((clidr) >> CLIDR_LOC_SHIFT) & 0x7)
>>  #define CLIDR_LOUIS(clidr) (((clidr) >> CLIDR_LOUIS_SHIFT) & 0x7)
>>
>> +#define CSSELR_TND_SHIFT 4
>> +#define CSSELR_TND_MASK (UL(1) << CSSELR_TND_SHIFT)
>> +#define CSSELR_LEVEL_SHIFT 1
>> +#define CSSELR_LEVEL_MASK (UL(7) << CSSELR_LEVEL_SHIFT)
>> +#define CSSELR_IND_SHIFT 0
>> +#define CSSERL_IND_MASK (UL(1) << CSSELR_IND_SHIFT)
>> +
>> +#define CCSIDR_64_LS_SHIFT 0
>> +#define CCSIDR_64_LS_MASK (UL(7) << CCSIDR_64_LS_SHIFT)
>> +#define CCSIDR_64_ASSOC_SHIFT 3
>> +#define CCSIDR_64_ASSOC_MASK (UL(0x1FFFFF) << CCSIDR_64_ASSOC_SHIFT)
>> +#define CCSIDR_64_SET_SHIFT 32
>> +#define CCSIDR_64_SET_MASK (UL(0xFFFFFF) << CCSIDR_64_SET_SHIFT)
>> +
>> +#define CCSIDR_32_LS_SHIFT 0
>> +#define CCSIDR_32_LS_MASK (UL(7) << CCSIDR_32_LS_SHIFT)
>> +#define CCSIDR_32_ASSOC_SHIFT 3
>> +#define CCSIDR_32_ASSOC_MASK (UL(0x3FF) << CCSIDR_32_ASSOC_SHIFT)
>> +#define CCSIDR_32_SET_SHIFT 13
>> +#define CCSIDR_32_SET_MASK (UL(0x7FFF) << CCSIDR_32_SET_SHIFT)
>
> I don't think we should be inferring cache structure from these register
> values. The Arm ARM helpfully says:
>
> | You cannot make any inference about the actual sizes of caches based
> | on these parameters.
>
> so we need to take the topology information from elsewhere.
>

Yeah, I also noticed the statement in the spec.
However, the L1 cache size derived from the above registers matches
what "lscpu" reports on the machine where I did my tests. Note that
"lscpu" depends on sysfs entries whose information is retrieved from
the ACPI (PPTT) table. The number of cache levels is partially
retrieved from a system register (clidr_el1).

It's doable to retrieve the L1 cache size from the ACPI (PPTT) table.
I'll change it accordingly in v2 if this enablement is really needed.
More clarification is provided below.

> But before we get into that, can you justify why we need to do this at all,
> please? Do you have data to show the benefit of adding this complexity?
>

Initially, I noticed that this is a missing feature which has already
been enabled on mips/s390. Currently, all read-only anonymous VMAs are
backed by the same zero page. It means all reads from these VMAs are
cached in the same cache set (though still across multiple ways, if
supported). So it would be nice to have multiple zero pages backing
these read-only anonymous VMAs, so that reads from them can be cached
in multiple sets (and still multiple ways, if supported). That should
be beneficial to performance overall.

Unfortunately, I didn't find a machine where the size of a cache way
(line size * number of sets) is larger than the page size. So I ran
one experiment as an indication of how L1 data cache misses affect
overall performance:

    L1 data cache size:            32KB
    L1 data cache line size:       64 bytes
    Number of L1 data cache sets:  64
    Number of L1 data cache ways:  8
    ----------------------------------------------------------------------
    size = (cache_line_size) * (num_of_sets) * (num_of_ways)

    Kernel configuration:
    VA_BITS:                48
    PAGE_SIZE:              4KB
    PMD HugeTLB Page Size:  2MB

Experiment:

    I have a program that does the following, and I check the elapsed
    time and L1-dcache-load-misses with perf.

    (1) Allocate (mmap) a PMD HugeTLB page, which is 2MB.
    (2) Read from the mmap'd region in steps of the page size (4KB),
        8 or 9 times. Note that 8 is the number of data cache ways.
    (3) Repeat (2) 1000000 times.
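As a rough illustration of why 8 versus 9 reads matters in step (2), the access pattern can be modelled with a small cache simulation. This is only a sketch under stated assumptions: it uses the geometry quoted above (64-byte lines, 64 sets, 8 ways), assumes strict LRU replacement (the real L1 replacement policy is not stated here and is probably different), and treats the huge page as starting at address 0. The names set_index() and count_misses() are made up for this example and are not part of the patch.

```c
#include <stdint.h>
#include <string.h>

/* Geometry from the message: 64-byte lines, 64 sets, 8 ways (32KB). */
#define LINE_SIZE       64u
#define NUM_SETS        64u
#define NUM_WAYS        8u
#define PAGE_SIZE_BYTES 4096u

struct cache_set {
	uint64_t tag[NUM_WAYS];   /* line number held by each way */
	uint64_t stamp[NUM_WAYS]; /* last-use time, 0 means the way is empty */
};

/* Set selected by an address: line number modulo the number of sets. */
static unsigned int set_index(uint64_t addr)
{
	return (unsigned int)((addr / LINE_SIZE) % NUM_SETS);
}

/*
 * Read `pages` addresses that are one page apart, `rounds` times, through
 * an idealised LRU cache and return the number of misses (assumption:
 * strict LRU, which real hardware need not implement).
 */
static uint64_t count_misses(unsigned int pages, unsigned int rounds)
{
	static struct cache_set sets[NUM_SETS];
	uint64_t clock = 0, misses = 0;

	memset(sets, 0, sizeof(sets));
	for (unsigned int r = 0; r < rounds; r++) {
		for (unsigned int p = 0; p < pages; p++) {
			uint64_t addr = (uint64_t)p * PAGE_SIZE_BYTES;
			uint64_t line = addr / LINE_SIZE;
			struct cache_set *s = &sets[set_index(addr)];
			unsigned int victim = 0;
			int hit = 0;

			clock++;
			for (unsigned int w = 0; w < NUM_WAYS; w++) {
				if (s->stamp[w] != 0 && s->tag[w] == line) {
					s->stamp[w] = clock; /* refresh LRU age */
					hit = 1;
					break;
				}
				/* Track the empty or least recently used way. */
				if (s->stamp[w] < s->stamp[victim])
					victim = w;
			}
			if (!hit) {
				misses++;
				s->tag[victim] = line;
				s->stamp[victim] = clock;
			}
		}
	}
	return misses;
}
```

With this geometry every page-strided address maps to set 0, so count_misses(8, rounds) only sees the initial fills, while count_misses(9, rounds) evicts on every access in the strict-LRU model. Real hardware with a different replacement policy will miss less often than this model predicts, so the numbers are only indicative.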
Result:

    (a) with 8 reads in step (2):

        37,103      L1-dcache-load-misses
        0.217522515 seconds time elapsed
        0.217564000 seconds user
        0.000000000 seconds sys

    (b) with 9 reads in step (2):

        4,687,932   L1-dcache-load-misses    (126 times)
        0.248132105 seconds time elapsed     (+14.1%)
        0.248267000 seconds user
        0.000000000 seconds sys

Please let me know if it's worth a v2 that retrieves the cache size
from the ACPI (PPTT) table. The cost is allocating multiple zero pages,
and the worst case is falling back to one zero page, as before :)

Cheers,
Gavin