Date: Fri, 4 Nov 2022 09:41:17 +0100
From: Michal Hocko
To: Leonardo Brás
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Daniel Bristot de Oliveira, Valentin Schneider, Johannes Weiner,
    Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
    Frederic Weisbecker, Phil Auld, Marcelo Tosatti,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 0/3] Avoid scheduling cache draining to isolated cpus
References: <20221102020243.522358-1-leobras@redhat.com>
    <07810c49ef326b26c971008fb03adf9dc533a178.camel@redhat.com>
    <0183b60e79cda3a0f992d14b4db5a818cd096e33.camel@redhat.com>
In-Reply-To: <0183b60e79cda3a0f992d14b4db5a818cd096e33.camel@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 03-11-22 13:53:41, Leonardo Brás wrote:
> On Thu, 2022-11-03 at 16:31 +0100, Michal Hocko wrote:
> > On Thu 03-11-22 11:59:20, Leonardo Brás wrote:
[...]
> > > I understand there will be a locking cost being paid in the isolated CPUs when:
> > > a) The isolated CPU is requesting the stock drain,
> > > b) When the isolated CPUs do a syscall and end up using the protected structure
> > > the first time after a remote drain.
> >
> > And anytime the charging path (consume_stock resp. refill_stock)
> > contends with the remote draining which is out of control of the RT
> > task.
> > It is true that the RT kernel will turn that spin lock into a
> > sleeping RT lock and that could help with potential priority inversions
> > but still quite costly thing I would expect.
> >
> > > Both (a) and (b) should happen during a syscall, and IIUC an RT workload
> > > should not expect the syscalls to have a predictable time, so it should be
> > > fine.
> >
> > Now I am not sure I understand. If you do not consider charging path to
> > be RT sensitive then why is this needed in the first place? What else
> > would be populating the pcp cache on the isolated cpu? IRQs?
>
> I am mostly trying to deal with drain_all_stock() calling schedule_work_on() at
> isolated_cpus. Since the scheduled drain_local_stock() will be competing for cpu
> time with the RT workload, we can have preemption of the RT workload, which is a
> problem for meeting the deadlines.

Yes, this is understood. But it is not really clear to me why any
draining would be necessary for such an isolated CPU if no workload other
than the RT one (which presumably doesn't charge any memory?) is running
on that CPU. Is it the RT task that leaves that cache behind during its
initialization phase, or something else?

Sorry for being so focused on this but I would like to understand
whether this is avoidable by a different startup scheme or whether it
really needs to be addressed in some way.

> One way I thought to solve that was introducing a remote drain, which would
> require a different strategy for locking, since not all accesses to the pcp
> caches would happen on a local CPU.

Yeah, I am not super happy about an additional spin lock TBH. One
potential way to go would be to completely avoid the pcp cache for
isolated CPUs. That would have some performance impact of course but on
the other hand it would give a more predictable behavior for those CPUs,
which sounds like a reasonable compromise to me. What do you think?
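
As a rough illustration of that trade-off, the idea can be modelled in
userspace as below. This is a toy sketch only: it is not the memcg code
and not a proposed patch. Every name in it (stock_model, charge_model(),
drain_all_model(), set_isolated_model(), the array sizes and the batch
size of 64) is hypothetical, and the comparisons with consume_stock(),
refill_stock() and drain_all_stock() are loose analogies.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL      4   /* hypothetical: CPUs in this toy model */
#define CHARGE_BATCH_MODEL 64  /* hypothetical: pages pre-charged per refill */

/* Hypothetical per-CPU stock: pages already charged but not yet used. */
struct stock_model {
	unsigned long cached_pages;
};

static struct stock_model stock[NR_CPUS_MODEL];
static bool isolated[NR_CPUS_MODEL];          /* stands in for the isolation mask */
static unsigned long shared_counter;          /* stands in for the shared page counter */
static unsigned long drain_work_queued[NR_CPUS_MODEL]; /* drain work items per CPU */

void set_isolated_model(int cpu, bool iso)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	isolated[cpu] = iso;
}

/* Charge nr_pages on behalf of a task running on `cpu`. */
void charge_model(int cpu, unsigned long nr_pages)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

	if (isolated[cpu]) {
		/* Isolated CPU: never touch the cache, pay the shared counter. */
		shared_counter += nr_pages;
		return;
	}
	if (stock[cpu].cached_pages >= nr_pages) {
		/* Fast path, loosely like consume_stock(). */
		stock[cpu].cached_pages -= nr_pages;
		return;
	}
	/* Slow path: charge a batch and cache the surplus, loosely like refill_stock(). */
	unsigned long batch = nr_pages > CHARGE_BATCH_MODEL ? nr_pages : CHARGE_BATCH_MODEL;
	shared_counter += batch;
	stock[cpu].cached_pages += batch - nr_pages;
}

/*
 * Give cached pages back to the shared counter. Work is only "queued" on
 * CPUs that hold cached pages, so an isolated CPU (whose cache stays
 * empty) is never disturbed. Returns the number of CPUs drained.
 */
int drain_all_model(void)
{
	int drained = 0;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (stock[cpu].cached_pages == 0)
			continue;
		drain_work_queued[cpu]++;
		shared_counter -= stock[cpu].cached_pages;
		stock[cpu].cached_pages = 0;
		drained++;
	}
	return drained;
}
```

In this model, marking CPU 3 isolated and charging one page on CPU 0 and
one on CPU 3 leaves a cached surplus only on CPU 0, so a later
drain_all_model() queues work on CPU 0 alone. The cost is visible too:
every charge from the isolated CPU goes to the shared counter, which is
the performance impact mentioned above.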
-- 
Michal Hocko
SUSE Labs