From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 03CF9C433EF
	for ; Wed, 30 Mar 2022 15:48:45 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1348403AbiC3Pu3 (ORCPT ); Wed, 30 Mar 2022 11:50:29 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51486 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1348387AbiC3PuW (ORCPT ); Wed, 30 Mar 2022 11:50:22 -0400
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
	by lindbergh.monkeyblade.net (Postfix) with ESMTP id ACCD72E095
	for ; Wed, 30 Mar 2022 08:48:37 -0700 (PDT)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4945223A;
	Wed, 30 Mar 2022 08:48:37 -0700 (PDT)
Received: from [192.168.178.6] (unknown [172.31.20.19])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 256963F73B;
	Wed, 30 Mar 2022 08:48:36 -0700 (PDT)
Message-ID: <5dc3a40e-f071-3ac8-4bf0-f555b9d94ff1@arm.com>
Date: Wed, 30 Mar 2022 17:48:34 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
	Thunderbird/91.5.0
Subject: Re: [PATCH] arch/arm64: Fix topology initialization for core
	scheduling
Content-Language: en-US
To: Phil Auld
Cc: linux-kernel@vger.kernel.org, Catalin Marinas, Will Deacon,
	Mark Rutland, Peter Zijlstra, linux-arm-kernel@lists.infradead.org
References: <20220322160304.26229-1-pauld@redhat.com>
	<1a546197-872b-7762-68ac-d5e6bb6d19aa@arm.com>
	<5a5381cd-813d-7cef-9948-01c3e5e910ef@arm.com>
From: Dietmar Eggemann
In-Reply-To:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 29/03/2022 21:50, Phil Auld wrote:
> On Tue, Mar 29, 2022 at 08:55:08PM +0200 Dietmar Eggemann wrote:
>> On 29/03/2022 17:20, Phil Auld wrote:
>>> On Tue, Mar 29, 2022 at 04:02:22PM +0200 Dietmar Eggemann wrote:
>>>> On 22/03/2022 17:03, Phil Auld wrote:

[...]

>>> This instance is an HPE Apollo 70 set to smt-4. I believe it's ThunderX2
>>> chips.
>>>
>>> ARM (CN9980-2200LG4077-Y21-G)
>> I'm using the same processor just with ACPI/PPTT.
>>
>
> Maybe I'm misinformed about these systems having no PPTT...
>
> I'm reclaiming the system. Is there a way I can tell from userspace?

# cat /sys/firmware/acpi/tables/PPTT > pptt.dat
# iasl -d pptt.dat
# vim pptt.dsl

[...]

>> so no SMT sched domain. The MPIDR-based topology fallback code in
>> store_cpu_topology() forces `cpuid_topo->thread_id = -1`.
>
> Right. So since I'm getting SMT it must not have package_id == -1.
> In which case you should be able to reproduce it because it must
> be that the call to update_siblings_masks() is required. That
> appears to only be called from store_cpu_topology() which is
> after the scheduler has already set up the core pointers.
>
> The fix could be the same but I should reword the commit message
> since it should affect all SMT arm systems I'd think.
>
> Or maybe the ACPI topology code should call update_siblings_masks().
>>
>> IMHO this is why on my machine I don't see this issue while running:
>>
>> root@oss-apollo7007:~# stress-ng --prctl 256 -t 60
>> stress-ng: info: [2388042] dispatching hogs: 256 prctl
>>
>> Is there something I'm missing in my setup to provoke this issue?
>>
>
> Make sure you have a stress-ng that is new enough and built against
> headers that have the CORE_SCHED prctls defined.

Ah, I was using a pretty old version 0.11.07. Now I switched to 0.13.12
which includes:

9038e442b92d - stress-prctl: add Linux 5.14 PR_SCHED_CORE prctl

To get SCHED_CORE activated in stress-prctl.c, as a quick hack, I had to
add the definitions of PR_SCHED_CORE, PR_SCHED_CORE_GET, etc. to this file.

Now the issue you described triggers on this machine immediately.
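
For anyone else building against userspace headers that predate Linux 5.14,
the quick hack mentioned above could look roughly like the sketch below. It
is not the actual stress-ng change. The fallback values are the ones I
believe include/uapi/linux/prctl.h uses, and sched_core_create_and_get() is
a hypothetical helper name invented for this illustration. It creates a
core scheduling cookie for the calling thread and reads it back.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/prctl.h>

/*
 * Fallback definitions for older userspace headers. Values assumed to
 * match include/uapi/linux/prctl.h (PR_SCHED_CORE arrived in Linux 5.14).
 */
#ifndef PR_SCHED_CORE
#define PR_SCHED_CORE			62
#define PR_SCHED_CORE_GET		0
#define PR_SCHED_CORE_CREATE		1
#define PR_SCHED_CORE_SHARE_TO		2
#define PR_SCHED_CORE_SHARE_FROM	3
#define PR_SCHED_CORE_MAX		4
#endif

#ifndef PR_SCHED_CORE_SCOPE_THREAD
#define PR_SCHED_CORE_SCOPE_THREAD	0
#endif

/*
 * Hypothetical helper, for illustration only: create a new core
 * scheduling cookie for the calling thread (pid 0), then read it back.
 * Returns 0 on success or a negative errno value on failure.
 */
static int sched_core_create_and_get(uint64_t *cookie)
{
	/* CREATE takes no user address, so the last argument is 0. */
	if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, 0UL,
		  (unsigned long)PR_SCHED_CORE_SCOPE_THREAD, 0UL) < 0)
		return -errno;

	/* GET stores the cookie through an 8-byte aligned pointer. */
	if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, 0UL,
		  (unsigned long)PR_SCHED_CORE_SCOPE_THREAD,
		  (unsigned long)cookie) < 0)
		return -errno;

	return 0;
}
```

A caller would pass the address of a uint64_t and check the return value.
A negative result is expected on kernels built without CONFIG_SCHED_CORE or
on machines without SMT, so it should not be treated as fatal in a test.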