From: <Conor.Dooley@microchip.com>
To: <sudeep.holla@arm.com>
Cc: <linux-kernel@vger.kernel.org>, <gregkh@linuxfoundation.org>,
	<atishp@atishpatra.org>, <atishp@rivosinc.com>,
	<vincent.guittot@linaro.org>, <dietmar.eggemann@arm.com>,
	<wangqing@vivo.com>, <robh+dt@kernel.org>, <rafael@kernel.org>,
	<ionela.voinescu@arm.com>, <pierre.gondois@arm.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-riscv@lists.infradead.org>, <gshan@redhat.com>,
	<Valentina.FernandezAlanis@microchip.com>
Subject: Re: [PATCH v5 09/19] arch_topology: Use the last level cache information from the cacheinfo
Date: Thu, 30 Jun 2022 16:37:50 +0000
Message-ID: <9d9e80b8-17e2-b1d9-14fa-f1d8d7dfbd9a@microchip.com>
In-Reply-To: <20220630103958.tcear5oz3orsqwg6@bogus>

On 30/06/2022 11:39, Sudeep Holla wrote:
> On Wed, Jun 29, 2022 at 11:25:41PM +0000, Conor.Dooley@microchip.com wrote:
>> On 29/06/2022 21:32, Conor.Dooley@microchip.com wrote:
>>> On 29/06/2022 20:54, Sudeep Holla wrote:
>>>> On Wed, Jun 29, 2022 at 07:39:43PM +0000, Conor.Dooley@microchip.com wrote:
>>>>> On 29/06/2022 19:42, Sudeep Holla wrote:
>>>>>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, Conor.Dooley@microchip.com wrote:
>>>>>>>
>>>>>>> No, no it doesn't. Not sure what I was thinking there.
>>>>>>> Prob tested that on the last commit that bisect tested
>>>>>>> rather than the one it pointed out the problem was with.
>>>>>>>
>>>>>>> Either way, boot is broken in -next.
>>>>>>>
>>>>>>
>>>>>> Can you check if the below fixes the issue?
>>>>>
>>>>> Unfortunately, no joy.
>>>>> Applied to a HEAD of 3b23bb2573e6 ("arch_topology: Use the
>>>>> last level cache information from the cacheinfo").
>>>>
>>>> That's bad. Does the system boot with
>>>> Commit 2f7b757eb69d ("arch_topology: Add support to parse and detect cache
>>>> attributes") ?
>>>
>>> It does.
>>
> 
> I can't think of any reason for that to happen unless detect_cache_attributes
> is failing from init_cpu_topology and we are ignoring that.
> 
> Are all RISC-V platforms failing on -next or is it just this platform ?

I don't know. I only have SoCs with this core complex & one that does not
work with upstream. I can try my other board with this SoC - but I am on
leave at the moment w/o a computer or internet during the day so it may be
a few days before I can try it.

However, Niklas Cassel has tried to use the Canaan K210 on next-20220630
but had issues with RCU stalling:
https://lore.kernel.org/linux-riscv/Yr3PKR0Uj1bE5Y6O@x1-carbon/T/#m52016996fcf5fa0501066d73352ed8e806803e06
Not going to claim any relation, but that's minus 1 to the platforms that
can be used to test this on upstream RISC-V.

> We may have to try with some logs in detect_cache_attributes,
> last_level_cache_is_valid and last_level_cache_is_shared to check where it
> is going wrong.
> 
> It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared

Yeah, I was playing around last night for a while but didn't figure out the
root cause. I'll try again tonight.

In the meantime, would you mind taking the patches out of -next?
FWIW I repro'd the failure on next-20220630.

Thanks,
Conor.

