Message-ID: <046af0e6-8e9a-ca74-048a-c0c9144ebb62@arm.com>
Date: Thu, 27 Apr 2023 15:12:12 +0100
Subject: Re: [PATCH v3 09/19] x86/resctrl: Queue mon_event_read() instead of sending an IPI
To: Reinette Chatre, x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin, Babu Moger, shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS, carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao, peternewman@google.com
References:
<20230320172620.18254-1-james.morse@arm.com> <20230320172620.18254-10-james.morse@arm.com> <5e6a2e0a-6f31-c9b0-5eec-346fd072d286@intel.com>
From: James Morse
In-Reply-To: <5e6a2e0a-6f31-c9b0-5eec-346fd072d286@intel.com>
Content-Type: text/plain; charset=UTF-8

Hi Reinette,

On 01/04/2023 00:25, Reinette Chatre wrote:
> On 3/20/2023 10:26 AM, James Morse wrote:
>> x86 is blessed with an abundance of monitors, one per RMID, that can be
>> read from any CPU in the domain. MPAM's monitors reside in the MMIO MSC,
>> and the number implemented is up to the manufacturer. This means that when
>> there are fewer monitors than needed, they must be allocated and freed.
>>
>> Worse, the domain may be broken up into slices, and the MMIO accesses
>> for each slice may need performing from different CPUs.
>>
>> These two details mean MPAM's monitor code needs to be able to sleep, and
>> to IPI another CPU in the domain to read from a resource that has been
>> sliced.
>>
>> mon_event_read() already invokes mon_event_count() via IPI, which means
>> this isn't possible. On systems using nohz-full, some CPUs need to be
>> interrupted to run kernel work as they otherwise stay in user-space
>> running realtime workloads. Interrupting these CPUs should be avoided,
>> and scheduling work on them may never complete.
>>
>> Change mon_event_read() to pick a housekeeping CPU (one that is not using
>> nohz_full), schedule mon_event_count() there, and wait. If all the CPUs
>> in a domain are using nohz-full, then an IPI is used as the fallback.
>
> It is not clear to me where in this solution an IPI is used as fallback ...
> (see below)

>> @@ -537,7 +543,16 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>>  	rr->val = 0;
>>  	rr->first = first;
>>
>> -	smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
>> +	cpu = get_cpu();
>> +	if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
>> +		mon_event_count(rr);
>> +		put_cpu();
>> +	} else {
>> +		put_cpu();
>> +
>> +		cpu = cpumask_any_housekeeping(&d->cpu_mask);
>> +		smp_call_on_cpu(cpu, mon_event_count, rr, false);
>> +	}
>>  }
>>
>
> ... from what I can tell there is no IPI fallback here. As per the previous
> patch, I understand cpumask_any_housekeeping() could still return
> a nohz_full CPU, and calling smp_call_on_cpu() on it would not send
> an IPI but instead queue the work to it. What did I miss?

Huh, it looks like it's still in my git-stash. Sorry about that. The combined
hunk looks like this:

----------------------%<----------------------
@@ -537,7 +550,26 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 	rr->val = 0;
 	rr->first = first;
 
-	smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
+	cpu = get_cpu();
+	if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
+		mon_event_count(rr);
+		put_cpu();
+	} else {
+		put_cpu();
+
+		cpu = cpumask_any_housekeeping(&d->cpu_mask);
+
+		/*
+		 * cpumask_any_housekeeping() prefers housekeeping CPUs, but
+		 * are all the CPUs nohz_full? If yes, pick a CPU to IPI.
+		 * MPAM's resctrl_arch_rmid_read() is unable to read the
+		 * counters on some platforms if it's called in irq context.
+		 */
+		if (tick_nohz_full_cpu(cpu))
+			smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
+		else
+			smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
+	}
 }
----------------------%<----------------------

Where smp_mon_event_count() is a static wrapper to make the types work.


Thanks,

James