From: Sudeep Holla <sudeep.holla@arm.com>
To: Conor.Dooley@microchip.com
Cc: linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org,
	atishp@atishpatra.org, atishp@rivosinc.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	wangqing@vivo.com, robh+dt@kernel.org, rafael@kernel.org,
	ionela.voinescu@arm.com, pierre.gondois@arm.com,
	linux-arm-kernel@lists.infradead.org,
	linux-riscv@lists.infradead.org, gshan@redhat.com,
	Valentina.FernandezAlanis@microchip.com
Subject: Re: [PATCH v5 09/19] arch_topology: Use the last level cache information from the cacheinfo
Date: Thu, 30 Jun 2022 21:07:17 +0100
Message-ID: <20220630200717.zlc6z6zcqbsw7euk@bogus>
In-Reply-To: <3840dbf7-ca18-b7ab-4d7a-92c9305476fa@microchip.com>

On Thu, Jun 30, 2022 at 07:20:04PM +0000, Conor.Dooley@microchip.com wrote:
> 
> 
> On 30/06/2022 18:35, Sudeep Holla wrote:
> > On Thu, Jun 30, 2022 at 04:37:50PM +0000, Conor.Dooley@microchip.com wrote:
> >> On 30/06/2022 11:39, Sudeep Holla wrote:
> >>>
> >>> I can't think of any reason for that to happen unless detect_cache_attributes
> >>> is failing from init_cpu_topology and we are ignoring that.
> >>>
> >>> Are all RISC-V platforms failing on -next or is it just this platform ?
> >>
> >> I don't know. I only have SoCs with this core complex & one that does not
> >> work with upstream. I can try my other board with this SoC - but I am on
> >> leave at the moment w/o a computer or internet during the day so it may be
> >> a few days before I can try it.
> >>
> > 
> > Sure, no worries.
> > 
> >> However, Niklas Cassel has tried to use the Canaan K210 on next-20220630
> >> but had issues with RCU stalling:
> >> https://lore.kernel.org/linux-riscv/Yr3PKR0Uj1bE5Y6O@x1-carbon/T/#m52016996fcf5fa0501066d73352ed8e806803e06
> >> Not going to claim any relation, but that's minus 1 to the platforms that
> >> can be used to test this on upstream RISC-V.
> >>
> > 
> > Ah OK, will check and ask full logs to see if there is any relation.
> > 
> >>> We may have to try with some logs in detect_cache_attributes,
> >>> last_level_cache_is_valid and last_level_cache_is_shared to check where it
> >>> is going wrong.
> >>>
> >>> It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared
> 
> 
> So, looks like there's a problem in cache_leaves_are_shared() which is hit
> by the above path. Both of the if clauses are false, and the function falls
> through to return sib_leaf->fw_token == this_leaf->fw_token;

Both if() checks failing is expected, and executing the statement
	return sib_leaf->fw_token == this_leaf->fw_token;
is correct.

> Both sib_leaf & this_leaf seem to be null.
>

But this is wrong, as last_level_cache_is_shared checks
last_level_cache_is_valid, which must return false if fw_token is NULL.
So we must not hit the above return statement with a NULL fw_token.
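
To make the expected ordering concrete, here is a minimal userspace sketch
of the guard described above. It is a model, not the kernel code: the
mock_* names, MOCK_CACHE_ID and the explicit NULL-leaf check are assumptions
made for illustration, and treating a set ID attribute as sufficient for
validity is also an assumption. Only the final comparison mirrors the
quoted cache_leaves_are_shared().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's CACHE_ID attribute bit. */
#define MOCK_CACHE_ID (1u << 0)

/* Hypothetical, trimmed-down model of struct cacheinfo. */
struct mock_cacheinfo {
	unsigned int id;
	unsigned int attributes;
	void *fw_token;
};

/*
 * Assumed validity rule: the leaf must exist and carry either a cache ID
 * or a non-NULL firmware token.
 */
static bool mock_llc_is_valid(const struct mock_cacheinfo *llc)
{
	if (!llc)
		return false;
	return (llc->attributes & MOCK_CACHE_ID) || llc->fw_token != NULL;
}

/* Same ID/token comparison as the quoted function, minus the config check. */
static bool mock_leaves_are_shared(const struct mock_cacheinfo *this_leaf,
				   const struct mock_cacheinfo *sib_leaf)
{
	if ((sib_leaf->attributes & MOCK_CACHE_ID) &&
	    (this_leaf->attributes & MOCK_CACHE_ID))
		return sib_leaf->id == this_leaf->id;

	return sib_leaf->fw_token == this_leaf->fw_token;
}

/* The comparison is reached only when both leaves pass the validity check. */
static bool mock_llc_is_shared(const struct mock_cacheinfo *x,
			       const struct mock_cacheinfo *y)
{
	if (!mock_llc_is_valid(x) || !mock_llc_is_valid(y))
		return false;

	return mock_leaves_are_shared(x, y);
}
```

In this model two leaves with NULL fw_token compare equal in
mock_leaves_are_shared(), which is why the validity check has to run first:
without it, NULL == NULL would report the caches as shared.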

> static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
> 					   struct cacheinfo *sib_leaf)
> {
> 	/*
> 	 * For non DT/ACPI systems, assume unique level 1 caches,
> 	 * system-wide shared caches for all other levels. This will be used
> 	 * only if arch specific code has not populated shared_cpu_map
> 	 */
> 	if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
> 		return !(this_leaf->level == 1);
> 
> 	if ((sib_leaf->attributes & CACHE_ID) &&
> 	    (this_leaf->attributes & CACHE_ID))
> 		return sib_leaf->id == this_leaf->id;
> 
> 	return sib_leaf->fw_token == this_leaf->fw_token;
> }
> 
> Any ideas what to look at next?

I wonder how last_level_cache_is_valid did not return false if the
fw_token is NULL. But it should not be NULL in the first place.
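
One thing worth separating while adding logs is whether the leaf pointer
itself is NULL or only its fw_token. A rough userspace sketch of such a
check follows; the dbg_* names and the message format are hypothetical, and
in the kernel the formatted line would go through the usual logging calls
rather than snprintf().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's struct cacheinfo. */
struct dbg_cacheinfo {
	unsigned int level;
	void *fw_token;
};

enum dbg_llc_state {
	DBG_LLC_LEAF_NULL,	/* the leaf pointer itself is NULL */
	DBG_LLC_TOKEN_NULL,	/* the leaf exists but fw_token is NULL */
	DBG_LLC_OK,		/* the leaf exists and has a token */
};

/* Classify a leaf without dereferencing a NULL pointer. */
static enum dbg_llc_state dbg_classify_llc(const struct dbg_cacheinfo *leaf)
{
	if (!leaf)
		return DBG_LLC_LEAF_NULL;
	if (!leaf->fw_token)
		return DBG_LLC_TOKEN_NULL;
	return DBG_LLC_OK;
}

static const char *dbg_llc_state_name(enum dbg_llc_state state)
{
	switch (state) {
	case DBG_LLC_LEAF_NULL:
		return "leaf pointer is NULL";
	case DBG_LLC_TOKEN_NULL:
		return "fw_token is NULL";
	default:
		return "ok";
	}
}

/* Format one log line for a CPU's last level cache leaf into buf. */
static int dbg_format_llc(char *buf, size_t len, unsigned int cpu,
			  const struct dbg_cacheinfo *leaf)
{
	return snprintf(buf, len, "cpu%u: LLC %s", cpu,
			dbg_llc_state_name(dbg_classify_llc(leaf)));
}
```

Logging this for both CPUs just before the comparison would show which of
the two cases is being hit.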

-- 
Regards,
Sudeep

