From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 24 Jun 2019 11:40:07 +0100
From: Will Deacon
To: Guo Ren
Cc: Will Deacon, julien.thierry@arm.com, aou@eecs.berkeley.edu,
 james.morse@arm.com, Arnd Bergmann, suzuki.poulose@arm.com, Marc Zyngier,
 catalin.marinas@arm.com, Anup Patel, linux-kernel@vger.kernel.org,
 rppt@linux.ibm.com, hch@infradead.org, Atish.Patra@wdc.com, Julien Grall,
 Palmer Dabbelt, gary@garyguo.net, paul.walmsley@sifive.com,
 christoffer.dall@arm.com, linux-riscv@lists.infradead.org,
 kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file
Message-ID: <20190624104006.lvm32nahemaqklxc@willie-the-truck>
References: <20190321163623.20219-1-julien.grall@arm.com>
 <20190321163623.20219-12-julien.grall@arm.com>
 <0dfe120b-066a-2ac8-13bc-3f5a29e2caa3@arm.com>
 <20190619091219.GB7767@fuggles.cambridge.arm.com>
 <20190619123939.GF7767@fuggles.cambridge.arm.com>

On Thu, Jun 20, 2019 at 05:33:03PM +0800, Guo Ren wrote:
> On Wed, Jun 19, 2019 at 8:39 PM Will Deacon wrote:
> >
> > On Wed, Jun 19, 2019 at 08:18:04PM +0800, Guo Ren wrote:
> > > On Wed, Jun 19, 2019 at 5:12 PM Will Deacon wrote:
> > > > This is one place where I'd actually prefer not to go down the route of
> > > > making the code generic. Context-switching and low-level TLB management
> > > > is deeply architecture-specific and I worry that by trying to make this
> > > > code common, we run the real risk of introducing subtle bugs on some
> > > > architecture every time it is changed.
> > > "Add generic asid code" and "move arm's into generic" are two things.
> > > We could do first and let architecture's maintainer to choose.
> >
> > If I understand the proposal being discussed, it involves basing that
> > generic ASID allocation code around the arm64 implementation which I don't
> > necessarily think is a good starting point.
> ...
> >
> > > > Furthermore, the algorithm we use
> > > > on arm64 is designed to scale to large systems using DVM and may well be
> > > > too complex and/or sub-optimal for architectures with different system
> > > > topologies or TLB invalidation mechanisms.
> > > It's just a asid algorithm not very complex and there is a callback
> > > for architecture to define their own local hart tlb flush. Seems it
> > > has nothing with DVM or tlb broadcast mechanism.
> >
> > I'm pleased that you think the algorithm is not very complex, but I'm also
> > worried that you might not have fully understood some of its finer details.
> I understand your concern about my less understanding of asid
> technology. Here is my short-description of arm64 asid allocator:
> (If you find anything wrong, please correct me directly, thx :)

The complexity mainly comes from the fact that this thing runs concurrently
with itself without synchronization on the fast-path. Coupled with the need
to use the same ASID for all threads of a task, you end up in fiddly
situations where rollover can occur on one CPU whilst another CPU is trying
to schedule a thread of a task that already has threads running in
userspace.

However, it's architecture-specific whether or not you care about that
scenario.

> > The reason I mention DVM and TLB broadcasting is because, depending on
> > the mechanisms in your architecture relating to those, it may be strictly
> > required that all concurrently running threads of a process have the same
> > ASID at any given point in time, or it may be that you really don't care.
> >
> > If you don't care, then the arm64 allocator is over-engineered and likely
> > inefficient for your system.
> > If you do care, then it's worth considering
> > whether a lock is sufficient around the allocator if you don't expect high
> > core counts. Another possibility is that you end up using only one ASID and
> > invalidating the local TLB on every context switch. Yet another design
> > would be to manage per-cpu ASID pools.
> I'll keep my system use the same ASID for SMP + IOMMU :P

You will want a separate allocator for that:

https://lkml.kernel.org/r/20190610184714.6786-2-jean-philippe.brucker@arm.com

> Yes, there are two styles of asid allocator: per-cpu ASID (MIPS) or
> same ASID (ARM).
> If the CPU couldn't support cache/tlb coherency maintain in hardware,
> it should use per-cpu ASID style because IPI is expensive and per-cpu
> ASID style need more software mechanism to improve performance
> (eg: delay cache flush). From software view the same ASID is clearer
> and easier to build bigger system with more TLB caches.
>
> I think the same ASID style is a more sensible choice for modern
> processor and let it be one of generic is reasonable.

I'm not sure I agree. x86, for example, is better off using a different
algorithm for allocating its PCIDs.

> > So rather than blindly copying the arm64 code, I suggest sitting down and
> > designing something that fits to your architecture instead. You may end up
> > with something that is both simpler and more efficient.
> In fact, riscv folks have discussed a lot about arm's asid allocator
> and I learned a lot from the discussion:
> https://lore.kernel.org/linux-riscv/20190327100201.32220-1-anup.patel@wdc.com/

If you require all threads of the same process to have the same ASID, then
that patch looks broken to me.
Will
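
The "lock around the allocator" option mentioned in the mail can be made
concrete with a small toy model. This is only a sketch under stated
assumptions: it is not the arm64 allocator and not code from any patch in
this thread. The names Mm, LockedAsidAllocator, switch_to and flush_pending
are hypothetical, ASID 0 is assumed to be reserved, and the number of usable
ASIDs is assumed to be at least the number of CPUs. Every context switch
takes a single lock, so there is no lock-free fast path. On rollover, the
ASIDs of address spaces that are running on some CPU are carried over into
the new generation, which is what keeps all threads of a process on the same
ASID.

```python
import threading


class Mm:
    """Hypothetical stand-in for an address space shared by a task's threads."""

    def __init__(self, name):
        self.name = name
        self.asid = 0        # 0 means "no ASID assigned yet"
        self.generation = 0  # generation in which self.asid was handed out


class LockedAsidAllocator:
    """Toy ASID allocator: one lock, generation-based rollover."""

    def __init__(self, num_asids, num_cpus):
        # Assumption: ASID 0 is never handed out, so num_asids - 1 are usable.
        # That count must cover every CPU, or rollover could find nothing free.
        if num_asids - 1 < num_cpus:
            raise ValueError("need at least one usable ASID per CPU")
        self.num_asids = num_asids
        self.generation = 1
        self.in_use = set()                      # ASIDs taken in this generation
        self.active = [None] * num_cpus          # Mm currently running per CPU
        self.flush_pending = [False] * num_cpus  # local TLB flush owed per CPU
        self.lock = threading.Lock()

    def _find_free(self):
        for asid in range(1, self.num_asids):
            if asid not in self.in_use:
                return asid
        return None

    def _rollover(self):
        # Start a new generation. Address spaces that are running right now
        # keep their ASID, so their other threads are not disturbed.
        self.generation += 1
        self.in_use.clear()
        for mm in self.active:
            if mm is not None:
                mm.generation = self.generation
                self.in_use.add(mm.asid)
        # Every CPU may hold stale entries for recycled ASIDs.
        self.flush_pending = [True] * len(self.active)

    def _new_asid(self):
        asid = self._find_free()
        if asid is None:
            self._rollover()
            asid = self._find_free()
        self.in_use.add(asid)
        return asid

    def switch_to(self, cpu, mm):
        """Return (asid, need_local_tlb_flush) for running mm on cpu."""
        with self.lock:
            self.active[cpu] = None  # this CPU is leaving its previous mm
            if mm.generation != self.generation:
                mm.asid = self._new_asid()
                mm.generation = self.generation
            self.active[cpu] = mm
            need_flush = self.flush_pending[cpu]
            self.flush_pending[cpu] = False
        return mm.asid, need_flush
```

In this model, a process with a thread running on one CPU keeps its ASID when
another CPU exhausts the ASID space and triggers a rollover. The price is that
every context switch serialises on the lock, which is why the mail ties this
option to systems that do not expect high core counts.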