From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 3 Dec 2020 19:33:40 +0100
From: Alexander Gordeev
To: Andy Lutomirski
Cc: Andy Lutomirski , Will Deacon , Catalin Marinas , Heiko Carstens ,
	Vasily Gorbik , Christian Borntraeger , Dave Hansen , Nicholas Piggin ,
	LKML , X86 ML , Mathieu Desnoyers , Arnd Bergmann , Peter Zijlstra ,
	linux-arch , linuxppc-dev ,
	Linux-MM , Anton Blanchard
Subject: Re: [PATCH 6/8] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
Message-ID: <20201203183339.GA29470@oc3871087118.ibm.com>
References: <20201203170332.GA27195@oc3871087118.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-GCONF: 00
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.312,18.0.737 definitions=2020-12-03_10:2020-12-03,2020-12-03 signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 impostorscore=0 suspectscore=0 phishscore=0 clxscore=1011 malwarescore=0 adultscore=0 lowpriorityscore=0 mlxscore=0 mlxlogscore=999 spamscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2012030107
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 03, 2020 at 09:14:22AM -0800, Andy Lutomirski wrote:
>
>
> > On Dec 3, 2020, at 9:09 AM, Alexander Gordeev wrote:
> >
> > On Mon, Nov 30, 2020 at 10:31:51AM -0800, Andy Lutomirski wrote:
> >> other arch folk: there's some background here:
> >
> >>
> >> power: Ridiculously complicated, seems to vary by system and kernel config.
> >>
> >> So, Nick, your unconditional IPI scheme is apparently a big
> >> improvement for power, and it should be an improvement and have low
> >> cost for x86. On arm64 and s390x it will add more IPIs on process
> >> exit but reduce contention on context switching depending on how lazy
> >
> > s390 does not invalidate TLBs per-CPU explicitly - we have special
> > instructions for that. Those in turn initiate signalling to other
> > CPUs, completely transparent to OS.
>
> Just to make sure I understand: this means that you broadcast flushes to all CPUs, not just a subset?

Correct.
If an mm has one CPU attached, we flush the TLB only for that CPU.
If an mm has more than one CPU attached, we flush all CPUs' TLBs.

In fact, the details are a bit more complicated, since the hardware
is able to flush subsets of TLB entries depending on the provided
parameters (e.g. the page tables used to create those entries).
But we cannot select a CPU subset.

> > Apart from mm_count, I am struggling to realize how the suggested
> > scheme could change the contention on s390 in connection with
> > TLB. Could you clarify a bit here, please?
>
> I’m just talking about mm_count. Maintaining mm_count is quite expensive on some workloads.
>
> >
> >> TLB works. I suppose we could try it for all architectures without
> >> any further optimizations. Or we could try one of the perhaps
> >> excessively clever improvements I linked above. arm64, s390x people,
> >> what do you think?
> >
> > I do not immediately see anything in the series that would harm
> > performance on s390.
> >
> > We however use mm_cpumask to distinguish between local and global TLB
> > flushes. With this series it looks like mm_cpumask is *required* to
> > be consistent with lazy users. And that is something quite difficult
> > for us to adhere to (at least in the foreseeable future).
>
> You don’t actually need to maintain mm_cpumask — we could scan all CPUs instead.
>
> >
> > But actually keeping track of lazy users in a cpumask is something
> > the generic code would rather do AFAICT.
>
> The problem is that arches don’t agree on what the contents of mm_cpumask should be. Tracking a mask of exactly what the arch wants in generic code is a nontrivial operation.

It could be yet another cpumask or the CPU scan you mentioned.
Just wanted to make sure there is no new requirement for an arch
to maintain mm_cpumask ;)

Thanks, Andy!
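
To illustrate the local-versus-global decision described above, here is a
rough user-space sketch in C. It is a toy model only, not the s390 kernel
code: the names flush_kind and pick_flush and the 64-bit mask standing in
for mm_cpumask are all hypothetical, and treating "exactly one attached CPU,
but not the current one" as a global flush is an assumption made here to
stay on the conservative side.

```c
#include <stdint.h>

/* Hypothetical flush kinds for this sketch; not the kernel's own names. */
enum flush_kind {
	FLUSH_LOCAL,	/* invalidate the TLB of the current CPU only */
	FLUSH_GLOBAL,	/* broadcast: hardware signals all CPUs */
};

/*
 * Toy stand-in for mm_cpumask: bit N is set when CPU N is attached
 * to the mm. this_cpu must be below 64 in this model.
 */
static enum flush_kind pick_flush(uint64_t attached_mask, unsigned int this_cpu)
{
	/* Only the current CPU is attached: a local flush is enough. */
	if (attached_mask == (UINT64_C(1) << this_cpu))
		return FLUSH_LOCAL;

	/*
	 * More than one CPU attached (or, by assumption, any other case):
	 * no CPU subset can be selected, so flush everywhere.
	 */
	return FLUSH_GLOBAL;
}
```

With this model, pick_flush(UINT64_C(1) << 3, 3) gives FLUSH_LOCAL, while a
mask with CPUs 0 and 2 set gives FLUSH_GLOBAL whichever CPU asks. It also
shows why the mask only has to answer "is it just me?" for this purpose,
which is a weaker property than being consistent with all lazy users.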