linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
@ 2018-09-11 19:32 Jeffrey Hugo
  2018-09-11 20:16 ` Jeremy Linton
  0 siblings, 1 reply; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-11 19:32 UTC (permalink / raw)
  To: rjw, linux-acpi, jeremy.linton; +Cc: linux-kernel, vkilari, Jeffrey Hugo

The type of a cache might not be specified by architectural mechanisms (ie
system registers), but its type might be specified in the PPTT.  In this
case, following the PPTT specification, we should identify the cache as
the type specified by PPTT.

This fixes the following lscpu issue where only the cache type sysfs file
is missing which results in no output providing a poor user experience in
the above system configuration-
lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
file or directory

Fixes: 2bd00bcd73e5 (ACPI/PPTT: Add Processor Properties Topology Table parsing)
Reported-by: Vijaya Kumar K <vkilari@codeaurora.org>
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
---
 drivers/acpi/pptt.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index d1e26cb..3c6db09 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -401,6 +401,21 @@ static void update_cache_properties(struct cacheinfo *this_leaf,
 			break;
 		}
 	}
+	if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
+	    (found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)) {
+		switch (found_cache->attributes & ACPI_PPTT_MASK_CACHE_TYPE) {
+		case ACPI_PPTT_CACHE_TYPE_DATA:
+			this_leaf->type = CACHE_TYPE_DATA;
+			break;
+		case ACPI_PPTT_CACHE_TYPE_INSTR:
+			this_leaf->type = CACHE_TYPE_INST;
+			break;
+		case ACPI_PPTT_CACHE_TYPE_UNIFIED:
+		case ACPI_PPTT_CACHE_TYPE_UNIFIED_ALT:
+			this_leaf->type = CACHE_TYPE_UNIFIED;
+			break;
+		}
+	}
 	/*
 	 * If the above flags are valid, and the cache type is NOCACHE
 	 * update the cache type as well.
-- 
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-11 19:32 [PATCH] ACPI/PPTT: Handle architecturally unknown cache types Jeffrey Hugo
@ 2018-09-11 20:16 ` Jeremy Linton
  2018-09-11 20:38   ` Jeffrey Hugo
  2018-09-12 10:37   ` Sudeep Holla
  0 siblings, 2 replies; 22+ messages in thread
From: Jeremy Linton @ 2018-09-11 20:16 UTC (permalink / raw)
  To: Jeffrey Hugo, rjw, linux-acpi; +Cc: linux-kernel, vkilari, Sudeep Holla

Hi Jeffrey,

(+Sudeep)

On 09/11/2018 02:32 PM, Jeffrey Hugo wrote:
> The type of a cache might not be specified by architectural mechanisms (ie
> system registers), but its type might be specified in the PPTT.  In this
> case, following the PPTT specification, we should identify the cache as
> the type specified by PPTT.
> 
> This fixes the following lscpu issue where only the cache type sysfs file
> is missing which results in no output providing a poor user experience in
> the above system configuration-
> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
> file or directory
> 
> Fixes: 2bd00bcd73e5 (ACPI/PPTT: Add Processor Properties Topology Table parsing)
> Reported-by: Vijaya Kumar K <vkilari@codeaurora.org>
> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
> ---
>   drivers/acpi/pptt.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index d1e26cb..3c6db09 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -401,6 +401,21 @@ static void update_cache_properties(struct cacheinfo *this_leaf,
>   			break;
>   		}
>   	}
> +	if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
> +	    (found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)) {
> +		switch (found_cache->attributes & ACPI_PPTT_MASK_CACHE_TYPE) {
> +		case ACPI_PPTT_CACHE_TYPE_DATA:
> +			this_leaf->type = CACHE_TYPE_DATA;
> +			break;
> +		case ACPI_PPTT_CACHE_TYPE_INSTR:
> +			this_leaf->type = CACHE_TYPE_INST;
> +			break;
> +		case ACPI_PPTT_CACHE_TYPE_UNIFIED:
> +		case ACPI_PPTT_CACHE_TYPE_UNIFIED_ALT:
> +			this_leaf->type = CACHE_TYPE_UNIFIED;
> +			break;
> +		}
> +	}
>   	/*
>   	 * If the above flags are valid, and the cache type is NOCACHE
>   	 * update the cache type as well.
> 

If you look at the next line of code following this comment its going to 
update the cache type for fully populated PPTT nodes. Although with the 
suggested change its only going to activate if someone completely fills 
out the node and fails to set the valid flag on the cache type.

What I suspect is happening in the reported case is that the nodes in 
the PPTT table are missing fields we consider to be important. Since 
that data isn't being filled out anywhere else, so we leave the cache 
type alone too. This has the effect of hiding sysfs nodes with 
incomplete information.

Also, the lack of the DATA/INST fields is based on the assumption that 
the only nodes which need their type field updated are outside of the 
CPU core itself so they are pretty much guaranteed to be UNIFIED. Are 
you hitting this case?


Thanks,

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-11 20:16 ` Jeremy Linton
@ 2018-09-11 20:38   ` Jeffrey Hugo
  2018-09-11 21:25     ` Jeremy Linton
  2018-09-12 10:49     ` Sudeep Holla
  2018-09-12 10:37   ` Sudeep Holla
  1 sibling, 2 replies; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-11 20:38 UTC (permalink / raw)
  To: Jeremy Linton, rjw, linux-acpi; +Cc: linux-kernel, vkilari, Sudeep Holla

On 9/11/2018 2:16 PM, Jeremy Linton wrote:
> Hi Jeffrey,
> 
> (+Sudeep)
> 
> On 09/11/2018 02:32 PM, Jeffrey Hugo wrote:
>> The type of a cache might not be specified by architectural mechanisms 
>> (ie
>> system registers), but its type might be specified in the PPTT.  In this
>> case, following the PPTT specification, we should identify the cache as
>> the type specified by PPTT.
>>
>> This fixes the following lscpu issue where only the cache type sysfs file
>> is missing which results in no output providing a poor user experience in
>> the above system configuration-
>> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No 
>> such
>> file or directory
>>
>> Fixes: 2bd00bcd73e5 (ACPI/PPTT: Add Processor Properties Topology 
>> Table parsing)
>> Reported-by: Vijaya Kumar K <vkilari@codeaurora.org>
>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>> ---
>>   drivers/acpi/pptt.c | 15 +++++++++++++++
>>   1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index d1e26cb..3c6db09 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -401,6 +401,21 @@ static void update_cache_properties(struct 
>> cacheinfo *this_leaf,
>>               break;
>>           }
>>       }
>> +    if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
>> +        (found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)) {
>> +        switch (found_cache->attributes & ACPI_PPTT_MASK_CACHE_TYPE) {
>> +        case ACPI_PPTT_CACHE_TYPE_DATA:
>> +            this_leaf->type = CACHE_TYPE_DATA;
>> +            break;
>> +        case ACPI_PPTT_CACHE_TYPE_INSTR:
>> +            this_leaf->type = CACHE_TYPE_INST;
>> +            break;
>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED:
>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED_ALT:
>> +            this_leaf->type = CACHE_TYPE_UNIFIED;
>> +            break;
>> +        }
>> +    }
>>       /*
>>        * If the above flags are valid, and the cache type is NOCACHE
>>        * update the cache type as well.
>>
> 
> If you look at the next line of code following this comment its going to 
> update the cache type for fully populated PPTT nodes. Although with the 
> suggested change its only going to activate if someone completely fills 
> out the node and fails to set the valid flag on the cache type. 

Yes, however that case doesn't apply to the scenario we are concerned 
about, doesn't seem to be fully following the PPTT spec, and seems odd 
that Linux just assumes that a "fully specified" cache is unified.

> What I suspect is happening in the reported case is that the nodes in 
> the PPTT table are missing fields we consider to be important. Since 
> that data isn't being filled out anywhere else, so we leave the cache 
> type alone too. This has the effect of hiding sysfs nodes with 
> incomplete information.
> 
> Also, the lack of the DATA/INST fields is based on the assumption that 
> the only nodes which need their type field updated are outside of the 
> CPU core itself so they are pretty much guaranteed to be UNIFIED. Are 
> you hitting this case?
> 

Yes.  Without this change, we hit the lscpu error in the commit message, 
and get zero output about the system.  We don't even get information 
about the caches which are architecturally specified or how many cpus 
are present.  With this change, we get what we expect out of lscpu (and 
also lstopo) including the cache(s) which are not architecturally specified.

I guess I still don't understand why its important for PPTT to list, for 
example, the sets/ways of a cache in all scenarios.  In the case of a 
"transparent" cache (implementation defined as not reported per section 
D3.4.2 of the ARM ARM where the cache cannot be managed by SW), there 
may not be valid values for sets/ways.  I would argue its better to not 
report that information, rather than provide bogus information just to 
make Linux happy, which may break other OSes and provide means for which 
a user to hang themselves.

However, in the case of a transparent cache, the size/type/level/write 
policy/etc (whatever the firmware provider deems relevant) might be 
valuable information for the OS to make scheduling decisions, and is 
certainly valuable for user space utilities for cross-machine/data 
center level job scheduling.

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-11 20:38   ` Jeffrey Hugo
@ 2018-09-11 21:25     ` Jeremy Linton
  2018-09-12 14:41       ` Jeffrey Hugo
  2018-09-12 10:49     ` Sudeep Holla
  1 sibling, 1 reply; 22+ messages in thread
From: Jeremy Linton @ 2018-09-11 21:25 UTC (permalink / raw)
  To: Jeffrey Hugo, rjw, linux-acpi; +Cc: linux-kernel, vkilari, Sudeep Holla

Hi,

On 09/11/2018 03:38 PM, Jeffrey Hugo wrote:
> On 9/11/2018 2:16 PM, Jeremy Linton wrote:
>> Hi Jeffrey,
>>
>> (+Sudeep)
>>
>> On 09/11/2018 02:32 PM, Jeffrey Hugo wrote:
>>> The type of a cache might not be specified by architectural 
>>> mechanisms (ie
>>> system registers), but its type might be specified in the PPTT.  In this
>>> case, following the PPTT specification, we should identify the cache as
>>> the type specified by PPTT.
>>>
>>> This fixes the following lscpu issue where only the cache type sysfs 
>>> file
>>> is missing which results in no output providing a poor user 
>>> experience in
>>> the above system configuration-
>>> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No 
>>> such
>>> file or directory
>>>
>>> Fixes: 2bd00bcd73e5 (ACPI/PPTT: Add Processor Properties Topology 
>>> Table parsing)
>>> Reported-by: Vijaya Kumar K <vkilari@codeaurora.org>
>>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>>> ---
>>>   drivers/acpi/pptt.c | 15 +++++++++++++++
>>>   1 file changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>>> index d1e26cb..3c6db09 100644
>>> --- a/drivers/acpi/pptt.c
>>> +++ b/drivers/acpi/pptt.c
>>> @@ -401,6 +401,21 @@ static void update_cache_properties(struct 
>>> cacheinfo *this_leaf,
>>>               break;
>>>           }
>>>       }
>>> +    if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
>>> +        (found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)) {
>>> +        switch (found_cache->attributes & ACPI_PPTT_MASK_CACHE_TYPE) {
>>> +        case ACPI_PPTT_CACHE_TYPE_DATA:
>>> +            this_leaf->type = CACHE_TYPE_DATA;
>>> +            break;
>>> +        case ACPI_PPTT_CACHE_TYPE_INSTR:
>>> +            this_leaf->type = CACHE_TYPE_INST;
>>> +            break;
>>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED:
>>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED_ALT:
>>> +            this_leaf->type = CACHE_TYPE_UNIFIED;
>>> +            break;
>>> +        }
>>> +    }
>>>       /*
>>>        * If the above flags are valid, and the cache type is NOCACHE
>>>        * update the cache type as well.
>>>
>>
>> If you look at the next line of code following this comment its going 
>> to update the cache type for fully populated PPTT nodes. Although with 
>> the suggested change its only going to activate if someone completely 
>> fills out the node and fails to set the valid flag on the cache type. 
> 
> Yes, however that case doesn't apply to the scenario we are concerned 
> about, doesn't seem to be fully following the PPTT spec, and seems odd 
> that Linux just assumes that a "fully specified" cache is unified.

Because, the architecturally specified ones won't be type NOCACHE?

> 
>> What I suspect is happening in the reported case is that the nodes in 
>> the PPTT table are missing fields we consider to be important. Since 
>> that data isn't being filled out anywhere else, so we leave the cache 
>> type alone too. This has the effect of hiding sysfs nodes with 
>> incomplete information.
>>
>> Also, the lack of the DATA/INST fields is based on the assumption that 
>> the only nodes which need their type field updated are outside of the 
>> CPU core itself so they are pretty much guaranteed to be UNIFIED. Are 
>> you hitting this case?
>>
> 
> Yes.  Without this change, we hit the lscpu error in the commit message, 
> and get zero output about the system.  We don't even get information 
> about the caches which are architecturally specified or how many cpus 
> are present.  With this change, we get what we expect out of lscpu (and 
> also lstopo) including the cache(s) which are not architecturally 
> specified.

I'm a bit surprised this changes the behavior of the architecturally 
specified ones. As I mentioned above, those shouldn't be NOCACHE. We use 
the level/type as a key for matching a PPTT node to an architecturally 
described cache. If that is mismatched, something more fundamental is 
happening. About the only case I can think of that can cause this is if 
the CLIDR type fields are incorrect.

In which case I might suggest we move your switch() for the INST/DATA 
into the check below that comment.

> 
> I guess I still don't understand why its important for PPTT to list, for 
> example, the sets/ways of a cache in all scenarios.  In the case of a 
> "transparent" cache (implementation defined as not reported per section 
> D3.4.2 of the ARM ARM where the cache cannot be managed by SW), there 
> may not be valid values for sets/ways.  I would argue its better to not 
> report that information, rather than provide bogus information just to 
> make Linux happy, which may break other OSes and provide means for which 
> a user to hang themselves.

This doesn't really have anything to do with the Arm/ARM's definition of 
set/way as it pertains to cache maint. Its more to assist userspace 
software with an understanding of the system cache architecture so it 
can make intelligent decisions about loop tiling and the like.

What we want is a consistent dependable view for software to utilize. An 
unreliable sysfs field is one that tends not to get used.

> 
> However, in the case of a transparent cache, the size/type/level/write 
> policy/etc (whatever the firmware provider deems relevant) might be 
> valuable information for the OS to make scheduling decisions, and is 
> certainly valuable for user space utilities for cross-machine/data ou
> center level job scheduling.
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-11 20:16 ` Jeremy Linton
  2018-09-11 20:38   ` Jeffrey Hugo
@ 2018-09-12 10:37   ` Sudeep Holla
  1 sibling, 0 replies; 22+ messages in thread
From: Sudeep Holla @ 2018-09-12 10:37 UTC (permalink / raw)
  To: Jeremy Linton, Jeffrey Hugo, rjw, linux-acpi
  Cc: Sudeep Holla, linux-kernel, vkilari



On 11/09/18 21:16, Jeremy Linton wrote:
> Hi Jeffrey,
> 
> (+Sudeep)
> 

Thanks for looping me in.

> 
> If you look at the next line of code following this comment its going to
> update the cache type for fully populated PPTT nodes. Although with the
> suggested change its only going to activate if someone completely fills
> out the node and fails to set the valid flag on the cache type.
> 
> What I suspect is happening in the reported case is that the nodes in
> the PPTT table are missing fields we consider to be important. Since
> that data isn't being filled out anywhere else, so we leave the cache
> type alone too. This has the effect of hiding sysfs nodes with
> incomplete information.
> 
> Also, the lack of the DATA/INST fields is based on the assumption that
> the only nodes which need their type field updated are outside of the
> CPU core itself so they are pretty much guaranteed to be UNIFIED. Are
> you hitting this case?
> 

Completely agree with you.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-11 20:38   ` Jeffrey Hugo
  2018-09-11 21:25     ` Jeremy Linton
@ 2018-09-12 10:49     ` Sudeep Holla
  2018-09-12 14:48       ` Jeffrey Hugo
  2018-09-13  5:51       ` Brice Goglin
  1 sibling, 2 replies; 22+ messages in thread
From: Sudeep Holla @ 2018-09-12 10:49 UTC (permalink / raw)
  To: Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi
  Cc: Sudeep Holla, linux-kernel, vkilari



On 11/09/18 21:38, Jeffrey Hugo wrote:
> On 9/11/2018 2:16 PM, Jeremy Linton wrote:
>> Hi Jeffrey,
>>
>> (+Sudeep)
>>

[..]

>>
>> If you look at the next line of code following this comment its going
>> to update the cache type for fully populated PPTT nodes. Although with
>> the suggested change its only going to activate if someone completely
>> fills out the node and fails to set the valid flag on the cache type. 
> 
> Yes, however that case doesn't apply to the scenario we are concerned
> about, doesn't seem to be fully following the PPTT spec, and seems odd
> that Linux just assumes that a "fully specified" cache is unified.
> 

Agreed, but adding code that will never get used is also not so good.
Do you have systems where this is needed ?

>> What I suspect is happening in the reported case is that the nodes in
>> the PPTT table are missing fields we consider to be important. Since
>> that data isn't being filled out anywhere else, so we leave the cache
>> type alone too. This has the effect of hiding sysfs nodes with
>> incomplete information.
>>
>> Also, the lack of the DATA/INST fields is based on the assumption that
>> the only nodes which need their type field updated are outside of the
>> CPU core itself so they are pretty much guaranteed to be UNIFIED. Are
>> you hitting this case?
>>
> 
> Yes.  Without this change, we hit the lscpu error in the commit message,
> and get zero output about the system.  We don't even get information
> about the caches which are architecturally specified or how many cpus
> are present.  With this change, we get what we expect out of lscpu (and
> also lstopo) including the cache(s) which are not architecturally
> specified.
> 

lscpu and lstopo are so broken. They just assume everything on CPU0.
If you hotplug them out, you start seeing issues. So reading and file
that doesn't exist and then bail out on other essential info though they
are present, hmmm ...

> I guess I still don't understand why its important for PPTT to list, for
> example, the sets/ways of a cache in all scenarios.  In the case of a
> "transparent" cache (implementation defined as not reported per section
> D3.4.2 of the ARM ARM where the cache cannot be managed by SW), there
> may not be valid values for sets/ways.  I would argue its better to not
> report that information, rather than provide bogus information just to
> make Linux happy, which may break other OSes and provide means for which
> a user to hang themselves.
> 

While I agree presenting wrong info is not good, but one of the reasons
the cache info is present is to provide those info for some applications
to do optimization based on that(I am told so and not sure if just type
and size will be good enough) and you seem to agree with that below.

> However, in the case of a transparent cache, the size/type/level/write
> policy/etc (whatever the firmware provider deems relevant) might be
> valuable information for the OS to make scheduling decisions, and is
> certainly valuable for user space utilities for cross-machine/data
> center level job scheduling.
> 

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-11 21:25     ` Jeremy Linton
@ 2018-09-12 14:41       ` Jeffrey Hugo
  2018-09-12 15:27         ` Sudeep Holla
  2018-09-12 15:39         ` Jeremy Linton
  0 siblings, 2 replies; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-12 14:41 UTC (permalink / raw)
  To: Jeremy Linton, rjw, linux-acpi; +Cc: linux-kernel, vkilari, Sudeep Holla

On 9/11/2018 3:25 PM, Jeremy Linton wrote:
> Hi,
> 
> On 09/11/2018 03:38 PM, Jeffrey Hugo wrote:
>> On 9/11/2018 2:16 PM, Jeremy Linton wrote:
>>> Hi Jeffrey,
>>>
>>> (+Sudeep)
>>>
>>> On 09/11/2018 02:32 PM, Jeffrey Hugo wrote:
>>>> The type of a cache might not be specified by architectural 
>>>> mechanisms (ie
>>>> system registers), but its type might be specified in the PPTT.  In 
>>>> this
>>>> case, following the PPTT specification, we should identify the cache as
>>>> the type specified by PPTT.
>>>>
>>>> This fixes the following lscpu issue where only the cache type sysfs 
>>>> file
>>>> is missing which results in no output providing a poor user 
>>>> experience in
>>>> the above system configuration-
>>>> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: 
>>>> No such
>>>> file or directory
>>>>
>>>> Fixes: 2bd00bcd73e5 (ACPI/PPTT: Add Processor Properties Topology 
>>>> Table parsing)
>>>> Reported-by: Vijaya Kumar K <vkilari@codeaurora.org>
>>>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>>>> ---
>>>>   drivers/acpi/pptt.c | 15 +++++++++++++++
>>>>   1 file changed, 15 insertions(+)
>>>>
>>>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>>>> index d1e26cb..3c6db09 100644
>>>> --- a/drivers/acpi/pptt.c
>>>> +++ b/drivers/acpi/pptt.c
>>>> @@ -401,6 +401,21 @@ static void update_cache_properties(struct 
>>>> cacheinfo *this_leaf,
>>>>               break;
>>>>           }
>>>>       }
>>>> +    if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
>>>> +        (found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)) {
>>>> +        switch (found_cache->attributes & ACPI_PPTT_MASK_CACHE_TYPE) {
>>>> +        case ACPI_PPTT_CACHE_TYPE_DATA:
>>>> +            this_leaf->type = CACHE_TYPE_DATA;
>>>> +            break;
>>>> +        case ACPI_PPTT_CACHE_TYPE_INSTR:
>>>> +            this_leaf->type = CACHE_TYPE_INST;
>>>> +            break;
>>>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED:
>>>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED_ALT:
>>>> +            this_leaf->type = CACHE_TYPE_UNIFIED;
>>>> +            break;
>>>> +        }
>>>> +    }
>>>>       /*
>>>>        * If the above flags are valid, and the cache type is NOCACHE
>>>>        * update the cache type as well.
>>>>
>>>
>>> If you look at the next line of code following this comment its going 
>>> to update the cache type for fully populated PPTT nodes. Although 
>>> with the suggested change its only going to activate if someone 
>>> completely fills out the node and fails to set the valid flag on the 
>>> cache type. 
>>
>> Yes, however that case doesn't apply to the scenario we are concerned 
>> about, doesn't seem to be fully following the PPTT spec, and seems odd 
>> that Linux just assumes that a "fully specified" cache is unified.
> 
> Because, the architecturally specified ones won't be type NOCACHE?

Correct.  However, what if you have a NOCACHE (not architecturally 
specified), that is fully described in PPTT, as a non-unified cache 
(data only)?  Unlikely?  Maybe.  Still seem possible though, therefore I 
feel this assumption is suspect.

> 
>>
>>> What I suspect is happening in the reported case is that the nodes in 
>>> the PPTT table are missing fields we consider to be important. Since 
>>> that data isn't being filled out anywhere else, so we leave the cache 
>>> type alone too. This has the effect of hiding sysfs nodes with 
>>> incomplete information.
>>>
>>> Also, the lack of the DATA/INST fields is based on the assumption 
>>> that the only nodes which need their type field updated are outside 
>>> of the CPU core itself so they are pretty much guaranteed to be 
>>> UNIFIED. Are you hitting this case?
>>>
>>
>> Yes.  Without this change, we hit the lscpu error in the commit 
>> message, and get zero output about the system.  We don't even get 
>> information about the caches which are architecturally specified or 
>> how many cpus are present.  With this change, we get what we expect 
>> out of lscpu (and also lstopo) including the cache(s) which are not 
>> architecturally specified.
> 
> I'm a bit surprised this changes the behavior of the architecturally 
> specified ones. As I mentioned above, those shouldn't be NOCACHE. We use 
> the level/type as a key for matching a PPTT node to an architecturally 
> described cache. If that is mismatched, something more fundamental is 
> happening. About the only case I can think of that can cause this is if 
> the CLIDR type fields are incorrect.
> 
> In which case I might suggest we move your switch() for the INST/DATA 
> into the check below that comment.

This change was not intended to impact architecturally specified ones, 
however I can see how that is the case.  That would be the case if the 
architecturally specified type is X and PPTT indicates Y.  According to 
the PPTT spec, the cache should be treated as type Y then (ie the 
contents of the PPTT override the architectural mechanisms).

I can move my change below the current NOCACHE handling, which would be 
valid for my scenario, but it seems odd.  If I do that, then the current 
assumption will take priority.  IE, if a cache is "fully specified" in 
PPTT, but is NOCACHE, then it will be treated as unified, regardless of 
what PPTT says (maybe instruction, or data only).

> 
>>
>> I guess I still don't understand why its important for PPTT to list, 
>> for example, the sets/ways of a cache in all scenarios.  In the case 
>> of a "transparent" cache (implementation defined as not reported per 
>> section D3.4.2 of the ARM ARM where the cache cannot be managed by 
>> SW), there may not be valid values for sets/ways.  I would argue its 
>> better to not report that information, rather than provide bogus 
>> information just to make Linux happy, which may break other OSes and 
>> provide means for which a user to hang themselves.
> 
> This doesn't really have anything to do with the Arm/ARM's definition of 
> set/way as it pertains to cache maint. Its more to assist userspace 
> software with an understanding of the system cache architecture so it 
> can make intelligent decisions about loop tiling and the like.
> 
> What we want is a consistent dependable view for software to utilize. An 
> unreliable sysfs field is one that tends not to get used.
I guess my argument is this -
The cache is not specified architecturally because it cannot be managed 
by software (see the ARM ARM section I referenced).

However, its important that the user be able to "see" it, I'm not a 
marketing guy, but I assume its going to be listed on the "data sheet".

Therefore, the firmware is providing some information about the cache 
(via SMBIOS tables and PPTT).

The HW designers have indicated that there is no sane way to provide 
sets/ways information to software, even on an informational basis (ie 
not for cache maintenance, but for performance optimizations). 
Therefore the firmware will not provide this information because it will 
be wrong.

So, therefore, we should still be able to tell the user that a cache 
exists at the relevant level, and what size it is.  On the concerned 
system, we cannot do that currently.

> 
>>
>> However, in the case of a transparent cache, the size/type/level/write 
>> policy/etc (whatever the firmware provider deems relevant) might be 
>> valuable information for the OS to make scheduling decisions, and is 
>> certainly valuable for user space utilities for cross-machine/data ou
>> center level job scheduling.
>>


-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 10:49     ` Sudeep Holla
@ 2018-09-12 14:48       ` Jeffrey Hugo
  2018-09-12 15:32         ` Sudeep Holla
  2018-09-13  5:51       ` Brice Goglin
  1 sibling, 1 reply; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-12 14:48 UTC (permalink / raw)
  To: Sudeep Holla, Jeremy Linton, rjw, linux-acpi; +Cc: linux-kernel, vkilari

On 9/12/2018 4:49 AM, Sudeep Holla wrote:
> 
> 
> On 11/09/18 21:38, Jeffrey Hugo wrote:
>> On 9/11/2018 2:16 PM, Jeremy Linton wrote:
>>> Hi Jeffrey,
>>>
>>> (+Sudeep)
>>>
> 
> [..]
> 
>>>
>>> If you look at the next line of code following this comment its going
>>> to update the cache type for fully populated PPTT nodes. Although with
>>> the suggested change its only going to activate if someone completely
>>> fills out the node and fails to set the valid flag on the cache type.
>>
>> Yes, however that case doesn't apply to the scenario we are concerned
>> about, doesn't seem to be fully following the PPTT spec, and seems odd
>> that Linux just assumes that a "fully specified" cache is unified.
>>
> 
> Agreed, but adding code that will never get used is also not so good.
> Do you have systems where this is needed ?

Yes.

> 
>>> What I suspect is happening in the reported case is that the nodes in
>>> the PPTT table are missing fields we consider to be important. Since
>>> that data isn't being filled out anywhere else, so we leave the cache
>>> type alone too. This has the effect of hiding sysfs nodes with
>>> incomplete information.
>>>
>>> Also, the lack of the DATA/INST fields is based on the assumption that
>>> the only nodes which need their type field updated are outside of the
>>> CPU core itself so they are pretty much guaranteed to be UNIFIED. Are
>>> you hitting this case?
>>>
>>
>> Yes.  Without this change, we hit the lscpu error in the commit message,
>> and get zero output about the system.  We don't even get information
>> about the caches which are architecturally specified or how many cpus
>> are present.  With this change, we get what we expect out of lscpu (and
>> also lstopo) including the cache(s) which are not architecturally
>> specified.
>>
> 
> lscpu and lstopo are so broken. They just assume everything on CPU0.
> If you hotplug them out, you start seeing issues. So reading and file
> that doesn't exist and then bail out on other essential info though they
> are present, hmmm ...

lscpu/lstopo being broken seems orthogonal to me.
The kernel is not providing the type file because the kernel thinks the 
type is NOCACHE, despite PPTT providing a type.  In an ideal world, I 
think the kernel should be fixed (this change), and any deficiencies or 
bad assumptions in lscpu/lstopo should also be fixed.

> 
>> I guess I still don't understand why its important for PPTT to list, for
>> example, the sets/ways of a cache in all scenarios.  In the case of a
>> "transparent" cache (implementation defined as not reported per section
>> D3.4.2 of the ARM ARM where the cache cannot be managed by SW), there
>> may not be valid values for sets/ways.  I would argue its better to not
>> report that information, rather than provide bogus information just to
>> make Linux happy, which may break other OSes and provide means for which
>> a user to hang themselves.
>>
> 
> While I agree presenting wrong info is not good, but one of the reasons
> the cache info is present is to provide those info for some applications
> to do optimization based on that(I am told so and not sure if just type
> and size will be good enough) and you seem to agree with that below.

Yes, but if its not entirely clear, on my system, we have the 
information but its not being presented.  This change fixes that.  I'm 
willing to discuss an alternative fix, but to ignore the issue is not 
viable in my opinion.

What do we need to move forward here?

> 
>> However, in the case of a transparent cache, the size/type/level/write
>> policy/etc (whatever the firmware provider deems relevant) might be
>> valuable information for the OS to make scheduling decisions, and is
>> certainly valuable for user space utilities for cross-machine/data
>> center level job scheduling.
>>
> 


-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 14:41       ` Jeffrey Hugo
@ 2018-09-12 15:27         ` Sudeep Holla
  2018-09-12 15:38           ` Sudeep Holla
  2018-09-12 15:39         ` Jeremy Linton
  1 sibling, 1 reply; 22+ messages in thread
From: Sudeep Holla @ 2018-09-12 15:27 UTC (permalink / raw)
  To: Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi
  Cc: Sudeep Holla, linux-kernel, vkilari



On 12/09/18 15:41, Jeffrey Hugo wrote:
> On 9/11/2018 3:25 PM, Jeremy Linton wrote:
>> Hi,
>>
>> On 09/11/2018 03:38 PM, Jeffrey Hugo wrote:
>>> On 9/11/2018 2:16 PM, Jeremy Linton wrote:
>>>> Hi Jeffrey,
>>>>
>>>> (+Sudeep)
>>>>
>>>> On 09/11/2018 02:32 PM, Jeffrey Hugo wrote:
>>>>> The type of a cache might not be specified by architectural
>>>>> mechanisms (ie
>>>>> system registers), but its type might be specified in the PPTT.  In
>>>>> this
>>>>> case, following the PPTT specification, we should identify the
>>>>> cache as
>>>>> the type specified by PPTT.
>>>>>
>>>>> This fixes the following lscpu issue where only the cache type
>>>>> sysfs file
>>>>> is missing which results in no output providing a poor user
>>>>> experience in
>>>>> the above system configuration-
>>>>> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type:
>>>>> No such
>>>>> file or directory
>>>>>
>>>>> Fixes: 2bd00bcd73e5 (ACPI/PPTT: Add Processor Properties Topology
>>>>> Table parsing)
>>>>> Reported-by: Vijaya Kumar K <vkilari@codeaurora.org>
>>>>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>>>>> ---
>>>>>   drivers/acpi/pptt.c | 15 +++++++++++++++
>>>>>   1 file changed, 15 insertions(+)
>>>>>
>>>>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>>>>> index d1e26cb..3c6db09 100644
>>>>> --- a/drivers/acpi/pptt.c
>>>>> +++ b/drivers/acpi/pptt.c
>>>>> @@ -401,6 +401,21 @@ static void update_cache_properties(struct
>>>>> cacheinfo *this_leaf,
>>>>>               break;
>>>>>           }
>>>>>       }
>>>>> +    if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
>>>>> +        (found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)) {
>>>>> +        switch (found_cache->attributes &
>>>>> ACPI_PPTT_MASK_CACHE_TYPE) {
>>>>> +        case ACPI_PPTT_CACHE_TYPE_DATA:
>>>>> +            this_leaf->type = CACHE_TYPE_DATA;
>>>>> +            break;
>>>>> +        case ACPI_PPTT_CACHE_TYPE_INSTR:
>>>>> +            this_leaf->type = CACHE_TYPE_INST;
>>>>> +            break;
>>>>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED:
>>>>> +        case ACPI_PPTT_CACHE_TYPE_UNIFIED_ALT:
>>>>> +            this_leaf->type = CACHE_TYPE_UNIFIED;
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>>       /*
>>>>>        * If the above flags are valid, and the cache type is NOCACHE
>>>>>        * update the cache type as well.
>>>>>
>>>>
>>>> If you look at the next line of code following this comment its
>>>> going to update the cache type for fully populated PPTT nodes.
>>>> Although with the suggested change its only going to activate if
>>>> someone completely fills out the node and fails to set the valid
>>>> flag on the cache type. 
>>>
>>> Yes, however that case doesn't apply to the scenario we are concerned
>>> about, doesn't seem to be fully following the PPTT spec, and seems
>>> odd that Linux just assumes that a "fully specified" cache is unified.
>>
>> Because, the architecturally specified ones won't be type NOCACHE?
> 
> Correct.  However, what if you have a NOCACHE (not architecturally
> specified), that is fully described in PPTT, as a non-unified cache
> (data only)?  Unlikely?  Maybe.  Still seem possible though, therefore I
> feel this assumption is suspect.
> 

Yes, we have other issues if the architecturally not specified cache is
not unified irrespective of what PPTT says. So we may need to review and
see if that assumption is removed everywhere.

Until then why can't a simple change fix the issue you have:

-->8

diff --git i/drivers/acpi/pptt.c w/drivers/acpi/pptt.c
index d1e26cb599bf..f74131201f5e 100644
--- i/drivers/acpi/pptt.c
+++ w/drivers/acpi/pptt.c
@@ -406,7 +406,8 @@ static void update_cache_properties(struct cacheinfo
*this_leaf,
         * update the cache type as well.
         */
        if (this_leaf->type == CACHE_TYPE_NOCACHE &&
-           valid_flags == PPTT_CHECKED_ATTRIBUTES)
+           (valid_flags == PPTT_CHECKED_ATTRIBUTES ||
+            found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID))
                this_leaf->type = CACHE_TYPE_UNIFIED;
 }

[...]

>>>
>>> Yes.  Without this change, we hit the lscpu error in the commit
>>> message, and get zero output about the system.  We don't even get
>>> information about the caches which are architecturally specified or
>>> how many cpus are present.

The above issues is purely lscpu to blame and nothing to do with kernel
change. If kernel doesn't report level 3 cache info and lscpu fails
to report level 1/2 cache info and kernel is to blame here ? That's unfair.

>>> With this change, we get what we expect
>>> out of lscpu (and also lstopo) including the cache(s) which are not
>>> architecturally specified.

Ofcourse, but we can't attribute that as kernel bug. I agree we should
not show incorrect info for cache type and that needs to be fixed but
disagree with lscpu issue as kernel issue. Kernel is not breaking any
ABI as cacheinfo itself is optional and the absence of it needs to be
dealt with.

>> I'm a bit surprised this changes the behavior of the architecturally
>> specified ones. As I mentioned above, those shouldn't be NOCACHE. We
>> use the level/type as a key for matching a PPTT node to an
>> architecturally described cache. If that is mismatched, something more
>> fundamental is happening. About the only case I can think of that can
>> cause this is if the CLIDR type fields are incorrect.
>>
>> In which case I might suggest we move your switch() for the INST/DATA
>> into the check below that comment.
> 
> This change was not intended to impact architecturally specified ones,
> however I can see how that is the case.  That would be the case if the
> architecturally specified type is X and PPTT indicates Y.  According to
> the PPTT spec, the cache should be treated as type Y then (ie the
> contents of the PPTT override the architectural mechanisms).
> 
> I can move my change below the current NOCACHE handling, which would be
> valid for my scenario, but it seems odd.  If I do that, then the current
> assumption will take priority.  IE, if a cache is "fully specified" in
> PPTT, but is NOCACHE, then it will be treated as unified, regardless of
> what PPTT says (maybe instruction, or data only).
> 

Yes it should be that way for above mentioned reasons. You need to fix
other things if you want to support separate data and inst cache for
architecturally unspecified cache.


[...]

> 
> However, its important that the user be able to "see" it, I'm not a
> marketing guy, but I assume its going to be listed on the "data sheet".
> 
> Therefore, the firmware is providing some information about the cache
> (via SMBIOS tables and PPTT).
> 
> The HW designers have indicated that there is no sane way to provide
> sets/ways information to software, even on an informational basis (ie
> not for cache maintenance, but for performance optimizations). Therefore
> the firmware will not provide this information because it will be wrong.
> 

Fair enough.

> So, therefore, we should still be able to tell the user that a cache
> exists at the relevant level, and what size it is.  On the concerned
> system, we cannot do that currently.
> 

Again agreed.

-- 
Regards,
Sudeep

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 14:48       ` Jeffrey Hugo
@ 2018-09-12 15:32         ` Sudeep Holla
  0 siblings, 0 replies; 22+ messages in thread
From: Sudeep Holla @ 2018-09-12 15:32 UTC (permalink / raw)
  To: Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi
  Cc: Sudeep Holla, linux-kernel, vkilari



On 12/09/18 15:48, Jeffrey Hugo wrote:
> On 9/12/2018 4:49 AM, Sudeep Holla wrote:
>>
>>
>> On 11/09/18 21:38, Jeffrey Hugo wrote:
>>> On 9/11/2018 2:16 PM, Jeremy Linton wrote:
>>>> Hi Jeffrey,
>>>>
>>>> (+Sudeep)
>>>>
>>
>> [..]
>>
>>>>
>>>> If you look at the next line of code following this comment its going
>>>> to update the cache type for fully populated PPTT nodes. Although with
>>>> the suggested change its only going to activate if someone completely
>>>> fills out the node and fails to set the valid flag on the cache type.
>>>
>>> Yes, however that case doesn't apply to the scenario we are concerned
>>> about, doesn't seem to be fully following the PPTT spec, and seems odd
>>> that Linux just assumes that a "fully specified" cache is unified.
>>>
>>
>> Agreed, but adding code that will never get used is also not so good.
>> Do you have systems where this is needed ?
> 
> Yes.
> 

Not exactly IMO. I was referring to architecturally not specified
separate data/inst cache.

>>
>>>> What I suspect is happening in the reported case is that the nodes in
>>>> the PPTT table are missing fields we consider to be important. Since
>>>> that data isn't being filled out anywhere else, so we leave the cache
>>>> type alone too. This has the effect of hiding sysfs nodes with
>>>> incomplete information.
>>>>
>>>> Also, the lack of the DATA/INST fields is based on the assumption that
>>>> the only nodes which need their type field updated are outside of the
>>>> CPU core itself so they are pretty much guaranteed to be UNIFIED. Are
>>>> you hitting this case?
>>>>
>>>
>>> Yes.  Without this change, we hit the lscpu error in the commit message,
>>> and get zero output about the system.  We don't even get information
>>> about the caches which are architecturally specified or how many cpus
>>> are present.  With this change, we get what we expect out of lscpu (and
>>> also lstopo) including the cache(s) which are not architecturally
>>> specified.
>>>
>>
>> lscpu and lstopo are so broken. They just assume everything on CPU0.
>> If you hotplug them out, you start seeing issues. So reading and file
>> that doesn't exist and then bail out on other essential info though they
>> are present, hmmm ...
> 
> lscpu/lstopo being broken seems orthogonal to me.

Agreed.

> The kernel is not providing the type file because the kernel thinks the
> type is NOCACHE, despite PPTT providing a type.  In an ideal world, I
> think the kernel should be fixed (this change), and any deficiencies or
> bad assumptions in lscpu/lstopo should also be fixed.
> 

Again agreed as mentioned in the other mail. I just didn't like the
attribution of that issue to this kernel bug.

>>
>>> I guess I still don't understand why its important for PPTT to list, for
>>> example, the sets/ways of a cache in all scenarios.  In the case of a
>>> "transparent" cache (implementation defined as not reported per section
>>> D3.4.2 of the ARM ARM where the cache cannot be managed by SW), there
>>> may not be valid values for sets/ways.  I would argue its better to not
>>> report that information, rather than provide bogus information just to
>>> make Linux happy, which may break other OSes and provide means for which
>>> a user to hang themselves.
>>>
>>
>> While I agree presenting wrong info is not good, but one of the reasons
>> the cache info is present is to provide those info for some applications
>> to do optimization based on that(I am told so and not sure if just type
>> and size will be good enough) and you seem to agree with that below.
> 
> Yes, but if its not entirely clear, on my system, we have the
> information but its not being presented.  This change fixes that.  I'm
> willing to discuss an alternative fix, but to ignore the issue is not
> viable in my opinion.
> 

Agreed.

> What do we need to move forward here?
> 

Will the simple change I posted address the issue ? I don't want to give
an illusion that separate data/inst cache is support for architecturally
not specified caches. If it needs to be supported, then it needs to be
fixed correctly everywhere not just here.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 15:27         ` Sudeep Holla
@ 2018-09-12 15:38           ` Sudeep Holla
  2018-09-12 15:57             ` Jeffrey Hugo
  0 siblings, 1 reply; 22+ messages in thread
From: Sudeep Holla @ 2018-09-12 15:38 UTC (permalink / raw)
  To: Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi
  Cc: Sudeep Holla, linux-kernel, vkilari



On 12/09/18 16:27, Sudeep Holla wrote:
> 
> 
> On 12/09/18 15:41, Jeffrey Hugo wrote:

[...]

>>
>> Correct.  However, what if you have a NOCACHE (not architecturally
>> specified), that is fully described in PPTT, as a non-unified cache
>> (data only)?  Unlikely?  Maybe.  Still seem possible though, therefore I
>> feel this assumption is suspect.
>>
> 
> Yes, we have other issues if the architecturally not specified cache is
> not unified irrespective of what PPTT says. So we may need to review and
> see if that assumption is removed everywhere.
> 
> Until then why can't a simple change fix the issue you have:
> 
> -->8
> 
> diff --git i/drivers/acpi/pptt.c w/drivers/acpi/pptt.c
> index d1e26cb599bf..f74131201f5e 100644
> --- i/drivers/acpi/pptt.c
> +++ w/drivers/acpi/pptt.c
> @@ -406,7 +406,8 @@ static void update_cache_properties(struct cacheinfo
> *this_leaf,
>          * update the cache type as well.
>          */
>         if (this_leaf->type == CACHE_TYPE_NOCACHE &&
> -           valid_flags == PPTT_CHECKED_ATTRIBUTES)
> +           (valid_flags == PPTT_CHECKED_ATTRIBUTES ||
> +            found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID))

Looking at this again, if we are supporting just presence of cache type
and size(may be), then we can drop the whole valid_flags thing here.

>                 this_leaf->type = CACHE_TYPE_UNIFIED;
>  }
> 
-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 14:41       ` Jeffrey Hugo
  2018-09-12 15:27         ` Sudeep Holla
@ 2018-09-12 15:39         ` Jeremy Linton
  2018-09-12 16:06           ` Jeffrey Hugo
  1 sibling, 1 reply; 22+ messages in thread
From: Jeremy Linton @ 2018-09-12 15:39 UTC (permalink / raw)
  To: Jeffrey Hugo, rjw, linux-acpi; +Cc: linux-kernel, vkilari, Sudeep Holla

Hi,


On 09/12/2018 09:41 AM, Jeffrey Hugo wrote:
> The HW designers have indicated that there is no sane way to provide 
> sets/ways information to software, even on an informational basis (ie 
> not for cache maintenance, but for performance optimizations). Therefore 
> the firmware will not provide this information because it will be wrong.
> 
> So, therefore, we should still be able to tell the user that a cache 
> exists at the relevant level, and what size it is.  On the concerned 
> system, we cannot do that currently.

Ok, so set the fields to zero in firmware node, and mark them valid.

That logically says that there isn't any set/way information for the 
cache (which implies direct mapped).


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 15:38           ` Sudeep Holla
@ 2018-09-12 15:57             ` Jeffrey Hugo
  2018-09-12 16:15               ` Sudeep Holla
  0 siblings, 1 reply; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-12 15:57 UTC (permalink / raw)
  To: Sudeep Holla, Jeremy Linton, rjw, linux-acpi; +Cc: linux-kernel, vkilari

On 9/12/2018 9:38 AM, Sudeep Holla wrote:
> 
> 
> On 12/09/18 16:27, Sudeep Holla wrote:
>>
>>
>> On 12/09/18 15:41, Jeffrey Hugo wrote:
> 
> [...]
> 
>>>
>>> Correct.  However, what if you have a NOCACHE (not architecturally
>>> specified), that is fully described in PPTT, as a non-unified cache
>>> (data only)?  Unlikely?  Maybe.  Still seem possible though, therefore I
>>> feel this assumption is suspect.
>>>
>>
>> Yes, we have other issues if the architecturally not specified cache is
>> not unified irrespective of what PPTT says. So we may need to review and
>> see if that assumption is removed everywhere.
>>
>> Until then why can't a simple change fix the issue you have:
>>
>> -->8
>>
>> diff --git i/drivers/acpi/pptt.c w/drivers/acpi/pptt.c
>> index d1e26cb599bf..f74131201f5e 100644
>> --- i/drivers/acpi/pptt.c
>> +++ w/drivers/acpi/pptt.c
>> @@ -406,7 +406,8 @@ static void update_cache_properties(struct cacheinfo
>> *this_leaf,
>>           * update the cache type as well.
>>           */
>>          if (this_leaf->type == CACHE_TYPE_NOCACHE &&
>> -           valid_flags == PPTT_CHECKED_ATTRIBUTES)
>> +           (valid_flags == PPTT_CHECKED_ATTRIBUTES ||
>> +            found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID))
> 
> Looking at this again, if we are supporting just presence of cache type
> and size(may be), then we can drop the whole valid_flags thing here.
> 
>>                  this_leaf->type = CACHE_TYPE_UNIFIED;
>>   }
>>

Yes, this change fixes my usecase.  I think it invalidates the comment, 
and really, the comment should probably mention that we assume unified 
type because there are other issues in supporting architecturally not 
specified inst/data only caches.

Do you want a V2 with this?  If so, do you want the fixes tag removed 
since you seem to view this as not a bug?

I don't think I clearly understand the purpose of the valid flags, 
therefore I feel as though I'm not sure if it can be dropped or not.  Is 
it fair to say that what the valid flags is accomplishing is identifying 
if we have a sufficient level of information to support this cache?  If 
not, then should the cacheinfo driver not expose any sysfs information 
about the cache?

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 15:39         ` Jeremy Linton
@ 2018-09-12 16:06           ` Jeffrey Hugo
  0 siblings, 0 replies; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-12 16:06 UTC (permalink / raw)
  To: Jeremy Linton, rjw, linux-acpi; +Cc: linux-kernel, vkilari, Sudeep Holla

On 9/12/2018 9:39 AM, Jeremy Linton wrote:
> Hi,
> 
> 
> On 09/12/2018 09:41 AM, Jeffrey Hugo wrote:
>> The HW designers have indicated that there is no sane way to provide 
>> sets/ways information to software, even on an informational basis (ie 
>> not for cache maintenance, but for performance optimizations). 
>> Therefore the firmware will not provide this information because it 
>> will be wrong.
>>
>> So, therefore, we should still be able to tell the user that a cache 
>> exists at the relevant level, and what size it is.  On the concerned 
>> system, we cannot do that currently.
> 
> Ok, so set the fields to zero in firmware node, and mark them valid.

Is that what the PPTT spec says to do?

> That logically says that there isn't any set/way information for the 
> cache (which implies direct mapped).

Making inferences such as that have gotten folks into trouble when 
interpreting other specs.  Unfortunately I am not allowed to detail more 
about this specific system, however implying that the affected cache(s) 
are direct mapped is incorrect.  Officially, what you have is a cache or 
multiple caches at certain levels that have a specified size.  You can 
make no inferences about the exact behavior or implementation of the 
cache beyond what FW explicitly provides.  I'm not particularly a fan of 
it, but its what I have to deal with.

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 15:57             ` Jeffrey Hugo
@ 2018-09-12 16:15               ` Sudeep Holla
  2018-09-12 16:25                 ` Jeffrey Hugo
  0 siblings, 1 reply; 22+ messages in thread
From: Sudeep Holla @ 2018-09-12 16:15 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Jeremy Linton, rjw, linux-acpi, linux-kernel, vkilari, Sudeep Holla

On Wed, Sep 12, 2018 at 09:57:14AM -0600, Jeffrey Hugo wrote:
> On 9/12/2018 9:38 AM, Sudeep Holla wrote:
> >
> >
> >On 12/09/18 16:27, Sudeep Holla wrote:
> >>
> >>
> >>On 12/09/18 15:41, Jeffrey Hugo wrote:
> >
> >[...]
> >
> >>>
> >>>Correct.  However, what if you have a NOCACHE (not architecturally
> >>>specified), that is fully described in PPTT, as a non-unified cache
> >>>(data only)?  Unlikely?  Maybe.  Still seem possible though, therefore I
> >>>feel this assumption is suspect.
> >>>
> >>
> >>Yes, we have other issues if the architecturally not specified cache is
> >>not unified irrespective of what PPTT says. So we may need to review and
> >>see if that assumption is removed everywhere.
> >>
> >>Until then why can't a simple change fix the issue you have:
> >>
> >>-->8
> >>
> >>diff --git i/drivers/acpi/pptt.c w/drivers/acpi/pptt.c
> >>index d1e26cb599bf..f74131201f5e 100644
> >>--- i/drivers/acpi/pptt.c
> >>+++ w/drivers/acpi/pptt.c
> >>@@ -406,7 +406,8 @@ static void update_cache_properties(struct cacheinfo
> >>*this_leaf,
> >>          * update the cache type as well.
> >>          */
> >>         if (this_leaf->type == CACHE_TYPE_NOCACHE &&
> >>-           valid_flags == PPTT_CHECKED_ATTRIBUTES)
> >>+           (valid_flags == PPTT_CHECKED_ATTRIBUTES ||
> >>+            found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID))
> >
> >Looking at this again, if we are supporting just presence of cache type
> >and size(may be), then we can drop the whole valid_flags thing here.
> >
> >>                 this_leaf->type = CACHE_TYPE_UNIFIED;
> >>  }
> >>
>
> Yes, this change fixes my usecase.  I think it invalidates the comment, and
> really, the comment should probably mention that we assume unified type
> because there are other issues in supporting architecturally not specified
> inst/data only caches.
>

Agreed.

> Do you want a V2 with this?  If so, do you want the fixes tag removed since
> you seem to view this as not a bug?
>

Yes please, I am fine to retain fixes tag but that's my opinion.

> I don't think I clearly understand the purpose of the valid flags, therefore
> I feel as though I'm not sure if it can be dropped or not.  Is it fair to
> say that what the valid flags is accomplishing is identifying if we have a
> sufficient level of information to support this cache?  If not, then should
> the cacheinfo driver not expose any sysfs information about the cache?
>

I don't see the use of the flag if we *have to* support the case where
all the cache geometry is not present but just cache type (and maybe
size?) is present. If that's the case we can drop valid flags entirely.
I really don't like the idea of supporting that, but I don't have strong
reasons to defend my idea, so I am fine with that.

Further, I think in your case with NOCACHE type set, sysfs dir shouldn't
have been created at the first place ideally. I think we need something
like below to fix that.

-->8

diff --git i/drivers/base/cacheinfo.c w/drivers/base/cacheinfo.c
index 5d5b5988e88b..cf78fa6d470d 100644
--- i/drivers/base/cacheinfo.c
+++ w/drivers/base/cacheinfo.c
@@ -615,6 +615,8 @@ static int cache_add_dev(unsigned int cpu)
 		this_leaf = this_cpu_ci->info_list + i;
 		if (this_leaf->disable_sysfs)
 			continue;
+		if (this_leaf->type == CACHE_TYPE_NOCACHE)
+			break;
 		cache_groups = cache_get_attribute_groups(this_leaf);
 		ci_dev = cpu_device_create(parent, this_leaf, cache_groups,
 					   "index%1u", i);

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 16:15               ` Sudeep Holla
@ 2018-09-12 16:25                 ` Jeffrey Hugo
  0 siblings, 0 replies; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-12 16:25 UTC (permalink / raw)
  To: Sudeep Holla; +Cc: Jeremy Linton, rjw, linux-acpi, linux-kernel, vkilari

On 9/12/2018 10:15 AM, Sudeep Holla wrote:
> On Wed, Sep 12, 2018 at 09:57:14AM -0600, Jeffrey Hugo wrote:
>> On 9/12/2018 9:38 AM, Sudeep Holla wrote:
>>>
>>>
>>> On 12/09/18 16:27, Sudeep Holla wrote:
>>>>
>>>>
>>>> On 12/09/18 15:41, Jeffrey Hugo wrote:
>>>
>>> [...]
>>>
>>>>>
>>>>> Correct.  However, what if you have a NOCACHE (not architecturally
>>>>> specified), that is fully described in PPTT, as a non-unified cache
>>>>> (data only)?  Unlikely?  Maybe.  Still seem possible though, therefore I
>>>>> feel this assumption is suspect.
>>>>>
>>>>
>>>> Yes, we have other issues if the architecturally not specified cache is
>>>> not unified irrespective of what PPTT says. So we may need to review and
>>>> see if that assumption is removed everywhere.
>>>>
>>>> Until then why can't a simple change fix the issue you have:
>>>>
>>>> -->8
>>>>
>>>> diff --git i/drivers/acpi/pptt.c w/drivers/acpi/pptt.c
>>>> index d1e26cb599bf..f74131201f5e 100644
>>>> --- i/drivers/acpi/pptt.c
>>>> +++ w/drivers/acpi/pptt.c
>>>> @@ -406,7 +406,8 @@ static void update_cache_properties(struct cacheinfo
>>>> *this_leaf,
>>>>           * update the cache type as well.
>>>>           */
>>>>          if (this_leaf->type == CACHE_TYPE_NOCACHE &&
>>>> -           valid_flags == PPTT_CHECKED_ATTRIBUTES)
>>>> +           (valid_flags == PPTT_CHECKED_ATTRIBUTES ||
>>>> +            found_cache->flags & ACPI_PPTT_CACHE_TYPE_VALID))
>>>
>>> Looking at this again, if we are supporting just presence of cache type
>>> and size(may be), then we can drop the whole valid_flags thing here.
>>>
>>>>                  this_leaf->type = CACHE_TYPE_UNIFIED;
>>>>   }
>>>>
>>
>> Yes, this change fixes my usecase.  I think it invalidates the comment, and
>> really, the comment should probably mention that we assume unified type
>> because there are other issues in supporting architecturally not specified
>> inst/data only caches.
>>
> 
> Agreed.
> 
>> Do you want a V2 with this?  If so, do you want the fixes tag removed since
>> you seem to view this as not a bug?
>>
> 
> Yes please, I am fine to retain fixes tag but that's my opinion.
> 
>> I don't think I clearly understand the purpose of the valid flags, therefore
>> I feel as though I'm not sure if it can be dropped or not.  Is it fair to
>> say that what the valid flags is accomplishing is identifying if we have a
>> sufficient level of information to support this cache?  If not, then should
>> the cacheinfo driver not expose any sysfs information about the cache?
>>
> 
> I don't see the use of the flag if we *have to* support the case where
> all the cache geometry is not present but just cache type (and maybe
> size?) is present. If that's the case we can drop valid flags entirely.
> I really don't like the idea of supporting that, but I don't have strong
> reasons to defend my idea, so I am fine with that.
> 
> Further, I think in your case with NOCACHE type set, sysfs dir shouldn't
> have been created at the first place ideally. I think we need something
> like below to fix that.
> 
> -->8
> 
> diff --git i/drivers/base/cacheinfo.c w/drivers/base/cacheinfo.c
> index 5d5b5988e88b..cf78fa6d470d 100644
> --- i/drivers/base/cacheinfo.c
> +++ w/drivers/base/cacheinfo.c
> @@ -615,6 +615,8 @@ static int cache_add_dev(unsigned int cpu)
>   		this_leaf = this_cpu_ci->info_list + i;
>   		if (this_leaf->disable_sysfs)
>   			continue;
> +		if (this_leaf->type == CACHE_TYPE_NOCACHE)
> +			break;
>   		cache_groups = cache_get_attribute_groups(this_leaf);
>   		ci_dev = cpu_device_create(parent, this_leaf, cache_groups,
>   					   "index%1u", i);
> 

Ok, let me test this out, and send out a v2.

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-12 10:49     ` Sudeep Holla
  2018-09-12 14:48       ` Jeffrey Hugo
@ 2018-09-13  5:51       ` Brice Goglin
  2018-09-13  9:39         ` James Morse
  1 sibling, 1 reply; 22+ messages in thread
From: Brice Goglin @ 2018-09-13  5:51 UTC (permalink / raw)
  To: Sudeep Holla, Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi
  Cc: linux-kernel, vkilari

Le 12/09/2018 à 11:49, Sudeep Holla a écrit :
>
>> Yes.  Without this change, we hit the lscpu error in the commit message,
>> and get zero output about the system.  We don't even get information
>> about the caches which are architecturally specified or how many cpus
>> are present.  With this change, we get what we expect out of lscpu (and
>> also lstopo) including the cache(s) which are not architecturally
>> specified.
>>
> lscpu and lstopo are so broken. They just assume everything on CPU0.
> If you hotplug them out, you start seeing issues. So reading and file
> that doesn't exist and then bail out on other essential info though they
> are present, hmmm ...

Can you elaborate?

I am not sure cpu0 is supposed to be offlineable on Linux. There's no
"online" file in /sys/devices/system/cpu/cpu0. That's why former lstopo
doesn't like CPU0 being hotplugged out. We are actually making that case
work for another non-standard corner case. But offlining "cpu0" this is
considered "normal", somebody must add that missing "online" sysfs
attribute for "cpu0" (change
https://elixir.bootlin.com/linux/latest/source/drivers/base/cpu.c#L375).

By the way, did anybody actually see an error with lstopo when there's
no "type" attribute for L3? I can't reproduce any issue, we just skip
that specific cache entirely, but everything else appears. If you guys
want to make that "no_cache" cache appear, I'll make it a Unified cache
unless you tell me what to show :)

Brice

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-13  5:51       ` Brice Goglin
@ 2018-09-13  9:39         ` James Morse
  2018-09-13 10:35           ` Sudeep Holla
  0 siblings, 1 reply; 22+ messages in thread
From: James Morse @ 2018-09-13  9:39 UTC (permalink / raw)
  To: Brice Goglin
  Cc: Sudeep Holla, Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi,
	linux-kernel, vkilari

Hi Brice,

On 13/09/18 06:51, Brice Goglin wrote:
> Le 12/09/2018 à 11:49, Sudeep Holla a écrit :
>>> Yes.  Without this change, we hit the lscpu error in the commit message,
>>> and get zero output about the system.  We don't even get information
>>> about the caches which are architecturally specified or how many cpus
>>> are present.  With this change, we get what we expect out of lscpu (and
>>> also lstopo) including the cache(s) which are not architecturally
>>> specified.
>>>
>> lscpu and lstopo are so broken. They just assume everything on CPU0.
>> If you hotplug them out, you start seeing issues. So reading and file
>> that doesn't exist and then bail out on other essential info though they
>> are present, hmmm ...
> 
> Can you elaborate?
> 
> I am not sure cpu0 is supposed to be offlineable on Linux. There's no
> "online" file in /sys/devices/system/cpu/cpu0. That's why former lstopo
> doesn't like CPU0 being hotplugged out. We are actually making that case
> work for another non-standard corner case. But offlining "cpu0" this is
> considered "normal", somebody must add that missing "online" sysfs
> attribute for "cpu0" (change
> https://elixir.bootlin.com/linux/latest/source/drivers/base/cpu.c#L375).

On x86 you can't normally offline CPU0, its something to do with certain
interrupts always being routed to CPU0, (oh, and hibernate).
You should be able to enable this behaviour with 'cpu0_hotplug' on the kernel
command line.

(Kconfig's CONFIG_BOOTPARAM_HOTPLUG_CPU0 and CONFIG_DEBUG_HOTPLUG_CPU0 are also
worth a look)


On arm64 at least, cpu0 is just like the others, and can be offlined.


Thanks,

James


> By the way, did anybody actually see an error with lstopo when there's
> no "type" attribute for L3? I can't reproduce any issue, we just skip
> that specific cache entirely, but everything else appears. If you guys
> want to make that "no_cache" cache appear, I'll make it a Unified cache
> unless you tell me what to show :)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-13  9:39         ` James Morse
@ 2018-09-13 10:35           ` Sudeep Holla
  2018-09-13 11:53             ` Brice Goglin
  0 siblings, 1 reply; 22+ messages in thread
From: Sudeep Holla @ 2018-09-13 10:35 UTC (permalink / raw)
  To: James Morse, Brice Goglin
  Cc: Jeffrey Hugo, Sudeep Holla, Jeremy Linton, rjw, linux-acpi,
	linux-kernel, vkilari

On Thu, Sep 13, 2018 at 10:39:10AM +0100, James Morse wrote:
> Hi Brice,
>
> On 13/09/18 06:51, Brice Goglin wrote:
> > Le 12/09/2018 à 11:49, Sudeep Holla a écrit :
> >>> Yes.  Without this change, we hit the lscpu error in the commit message,
> >>> and get zero output about the system.  We don't even get information
> >>> about the caches which are architecturally specified or how many cpus
> >>> are present.  With this change, we get what we expect out of lscpu (and
> >>> also lstopo) including the cache(s) which are not architecturally
> >>> specified.
> >>>
> >> lscpu and lstopo are so broken. They just assume everything on CPU0.
> >> If you hotplug them out, you start seeing issues. So reading and file
> >> that doesn't exist and then bail out on other essential info though they
> >> are present, hmmm ...
> >
> > Can you elaborate?
> >
> > I am not sure cpu0 is supposed to be offlineable on Linux. There's no
> > "online" file in /sys/devices/system/cpu/cpu0. That's why former lstopo
> > doesn't like CPU0 being hotplugged out. We are actually making that case
> > work for another non-standard corner case. But offlining "cpu0" this is
> > considered "normal", somebody must add that missing "online" sysfs
> > attribute for "cpu0" (change
> > https://elixir.bootlin.com/linux/latest/source/drivers/base/cpu.c#L375).
>
> On x86 you can't normally offline CPU0, its something to do with certain
> interrupts always being routed to CPU0, (oh, and hibernate).
> You should be able to enable this behaviour with 'cpu0_hotplug' on the kernel
> command line.
>
> (Kconfig's CONFIG_BOOTPARAM_HOTPLUG_CPU0 and CONFIG_DEBUG_HOTPLUG_CPU0 are also
> worth a look)
>
> On arm64 at least, cpu0 is just like the others, and can be offlined.
>

Thanks James, for providing all the details.

To add to the issues I spotted with lscpu/lstopo around topology, it ignores
the updates to topology sibling masks when CPUs are hotplugged in and out.

We have following in lscpu:
	add_summary_n(tb, _("Core(s) per socket:"),
			cores_per_socket ?: desc->ncores / desc->nsockets);

Now when cores_per_socket = 1, (i.e when we don't have procfs entry),
if ncores = (ncores_max - few_cpus_hotplugged_out), core(s) per socket
will get computed as less than the actual number.

IMO lscpu should be used only when all CPUs are online and it should have
a warning when all cores are not online.

> > By the way, did anybody actually see an error with lstopo when there's
> > no "type" attribute for L3? I can't reproduce any issue, we just skip
> > that specific cache entirely, but everything else appears. If you guys
> > want to make that "no_cache" cache appear, I'll make it a Unified cache
> > unless you tell me what to show :)

IIUC, Jeffrey Hugo did see error as per his initial message:
"
This fixes the following lscpu issue where only the cache type sysfs file
is missing which results in no output providing a poor user experience in
the above system configuration.
lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
file or directory
"

--
Regards,
Sudeep

[1] https://www.spinics.net/lists/arm-kernel/msg661101.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-13 10:35           ` Sudeep Holla
@ 2018-09-13 11:53             ` Brice Goglin
  2018-09-13 15:10               ` Jeffrey Hugo
  2018-09-13 15:16               ` Sudeep Holla
  0 siblings, 2 replies; 22+ messages in thread
From: Brice Goglin @ 2018-09-13 11:53 UTC (permalink / raw)
  To: Sudeep Holla, James Morse
  Cc: Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi, linux-kernel, vkilari

Le 13/09/2018 à 11:35, Sudeep Holla a écrit :
> On Thu, Sep 13, 2018 at 10:39:10AM +0100, James Morse wrote:
>> Hi Brice,
>>
>> On 13/09/18 06:51, Brice Goglin wrote:
>>> Le 12/09/2018 à 11:49, Sudeep Holla a écrit :
>>>>> Yes.  Without this change, we hit the lscpu error in the commit message,
>>>>> and get zero output about the system.  We don't even get information
>>>>> about the caches which are architecturally specified or how many cpus
>>>>> are present.  With this change, we get what we expect out of lscpu (and
>>>>> also lstopo) including the cache(s) which are not architecturally
>>>>> specified.
>>>>>
>>>> lscpu and lstopo are so broken. They just assume everything on CPU0.
>>>> If you hotplug them out, you start seeing issues. So reading and file
>>>> that doesn't exist and then bail out on other essential info though they
>>>> are present, hmmm ...
>>> Can you elaborate?
>>>
>>> I am not sure cpu0 is supposed to be offlineable on Linux. There's no
>>> "online" file in /sys/devices/system/cpu/cpu0. That's why former lstopo
>>> doesn't like CPU0 being hotplugged out. We are actually making that case
>>> work for another non-standard corner case. But offlining "cpu0" this is
>>> considered "normal", somebody must add that missing "online" sysfs
>>> attribute for "cpu0" (change
>>> https://elixir.bootlin.com/linux/latest/source/drivers/base/cpu.c#L375).
>> On x86 you can't normally offline CPU0, its something to do with certain
>> interrupts always being routed to CPU0, (oh, and hibernate).
>> You should be able to enable this behaviour with 'cpu0_hotplug' on the kernel
>> command line.
>>
>> (Kconfig's CONFIG_BOOTPARAM_HOTPLUG_CPU0 and CONFIG_DEBUG_HOTPLUG_CPU0 are also
>> worth a look)
>>
>> On arm64 at least, cpu0 is just like the others, and can be offlined.
>>
> Thanks James, for providing all the details.
>
> To add to the issues I spotted with lscpu/lstopo around topology, it ignores
> the updates to topology sibling masks when CPUs are hotplugged in and out.
>
> We have following in lscpu:
> 	add_summary_n(tb, _("Core(s) per socket:"),
> 			cores_per_socket ?: desc->ncores / desc->nsockets);
>
> Now when cores_per_socket = 1, (i.e when we don't have procfs entry),
> if ncores = (ncores_max - few_cpus_hotplugged_out), core(s) per socket
> will get computed as less than the actual number.
>
> IMO lscpu should be used only when all CPUs are online and it should have
> a warning when all cores are not online.
>
>>> By the way, did anybody actually see an error with lstopo when there's
>>> no "type" attribute for L3? I can't reproduce any issue, we just skip
>>> that specific cache entirely, but everything else appears. If you guys
>>> want to make that "no_cache" cache appear, I'll make it a Unified cache
>>> unless you tell me what to show :)
> IIUC, Jeffrey Hugo did see error as per his initial message:
> "
> This fixes the following lscpu issue where only the cache type sysfs file
> is missing which results in no output providing a poor user experience in
> the above system configuration.
> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
> file or directory
> "
>

I don't know about lscpu (it's a different project), but lstopo
shouldn't have any such problem.

If you see an issue with lstopo, I'd be interesting in getting the
tarball generated by hwloc-gather-topology (it dumps useful files from
procfs and sysfs so that we may debug offline).

Brice


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-13 11:53             ` Brice Goglin
@ 2018-09-13 15:10               ` Jeffrey Hugo
  2018-09-13 15:16               ` Sudeep Holla
  1 sibling, 0 replies; 22+ messages in thread
From: Jeffrey Hugo @ 2018-09-13 15:10 UTC (permalink / raw)
  To: Brice Goglin, Sudeep Holla, James Morse
  Cc: Jeremy Linton, rjw, linux-acpi, linux-kernel, vkilari

On 9/13/2018 5:53 AM, Brice Goglin wrote:
> Le 13/09/2018 à 11:35, Sudeep Holla a écrit :
>> On Thu, Sep 13, 2018 at 10:39:10AM +0100, James Morse wrote:
>>> Hi Brice,
>>>
>>> On 13/09/18 06:51, Brice Goglin wrote:
>>>> Le 12/09/2018 à 11:49, Sudeep Holla a écrit :
>>>>>> Yes.  Without this change, we hit the lscpu error in the commit message,
>>>>>> and get zero output about the system.  We don't even get information
>>>>>> about the caches which are architecturally specified or how many cpus
>>>>>> are present.  With this change, we get what we expect out of lscpu (and
>>>>>> also lstopo) including the cache(s) which are not architecturally
>>>>>> specified.
>>>>>>
>>>>> lscpu and lstopo are so broken. They just assume everything on CPU0.
>>>>> If you hotplug them out, you start seeing issues. So reading and file
>>>>> that doesn't exist and then bail out on other essential info though they
>>>>> are present, hmmm ...
>>>> Can you elaborate?
>>>>
>>>> I am not sure cpu0 is supposed to be offlineable on Linux. There's no
>>>> "online" file in /sys/devices/system/cpu/cpu0. That's why former lstopo
>>>> doesn't like CPU0 being hotplugged out. We are actually making that case
>>>> work for another non-standard corner case. But offlining "cpu0" this is
>>>> considered "normal", somebody must add that missing "online" sysfs
>>>> attribute for "cpu0" (change
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/base/cpu.c#L375).
>>> On x86 you can't normally offline CPU0, its something to do with certain
>>> interrupts always being routed to CPU0, (oh, and hibernate).
>>> You should be able to enable this behaviour with 'cpu0_hotplug' on the kernel
>>> command line.
>>>
>>> (Kconfig's CONFIG_BOOTPARAM_HOTPLUG_CPU0 and CONFIG_DEBUG_HOTPLUG_CPU0 are also
>>> worth a look)
>>>
>>> On arm64 at least, cpu0 is just like the others, and can be offlined.
>>>
>> Thanks James, for providing all the details.
>>
>> To add to the issues I spotted with lscpu/lstopo around topology, it ignores
>> the updates to topology sibling masks when CPUs are hotplugged in and out.
>>
>> We have following in lscpu:
>> 	add_summary_n(tb, _("Core(s) per socket:"),
>> 			cores_per_socket ?: desc->ncores / desc->nsockets);
>>
>> Now when cores_per_socket = 1, (i.e when we don't have procfs entry),
>> if ncores = (ncores_max - few_cpus_hotplugged_out), core(s) per socket
>> will get computed as less than the actual number.
>>
>> IMO lscpu should be used only when all CPUs are online and it should have
>> a warning when all cores are not online.
>>
>>>> By the way, did anybody actually see an error with lstopo when there's
>>>> no "type" attribute for L3? I can't reproduce any issue, we just skip
>>>> that specific cache entirely, but everything else appears. If you guys
>>>> want to make that "no_cache" cache appear, I'll make it a Unified cache
>>>> unless you tell me what to show :)
>> IIUC, Jeffrey Hugo did see error as per his initial message:
>> "
>> This fixes the following lscpu issue where only the cache type sysfs file
>> is missing which results in no output providing a poor user experience in
>> the above system configuration.
>> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
>> file or directory
>> "
>>
> 
> I don't know about lscpu (it's a different project), but lstopo
> shouldn't have any such problem.
> 
> If you see an issue with lstopo, I'd be interesting in getting the
> tarball generated by hwloc-gather-topology (it dumps useful files from
> procfs and sysfs so that we may debug offline).

No error was reported with lstopo, but we don't see the cache as 
expected.  Fixing the type results in the expected lstopo output.  This 
seems consistent with your expectations.

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
  2018-09-13 11:53             ` Brice Goglin
  2018-09-13 15:10               ` Jeffrey Hugo
@ 2018-09-13 15:16               ` Sudeep Holla
  1 sibling, 0 replies; 22+ messages in thread
From: Sudeep Holla @ 2018-09-13 15:16 UTC (permalink / raw)
  To: Brice Goglin, James Morse
  Cc: Sudeep Holla, Jeffrey Hugo, Jeremy Linton, rjw, linux-acpi,
	linux-kernel, vkilari



On 13/09/18 12:53, Brice Goglin wrote:
> Le 13/09/2018 à 11:35, Sudeep Holla a écrit :


>>> On 13/09/18 06:51, Brice Goglin wrote:

[...]

>>>> By the way, did anybody actually see an error with lstopo when there's
>>>> no "type" attribute for L3? I can't reproduce any issue, we just skip
>>>> that specific cache entirely, but everything else appears. If you guys
>>>> want to make that "no_cache" cache appear, I'll make it a Unified cache
>>>> unless you tell me what to show :)
>> IIUC, Jeffrey Hugo did see error as per his initial message:
>> "
>> This fixes the following lscpu issue where only the cache type sysfs file
>> is missing which results in no output providing a poor user experience in
>> the above system configuration.
>> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
>> file or directory
>> "
>>
> 
> I don't know about lscpu (it's a different project), but lstopo
> shouldn't have any such problem.
> 

Ah OK, I have only looked at lscpu source. Sorry for that, I assumed
they are based on same/similar code base. Thanks for letting me know
they are not.

> If you see an issue with lstopo, I'd be interesting in getting the
> tarball generated by hwloc-gather-topology (it dumps useful files from
> procfs and sysfs so that we may debug offline).

Thanks for the information.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-09-13 15:17 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-11 19:32 [PATCH] ACPI/PPTT: Handle architecturally unknown cache types Jeffrey Hugo
2018-09-11 20:16 ` Jeremy Linton
2018-09-11 20:38   ` Jeffrey Hugo
2018-09-11 21:25     ` Jeremy Linton
2018-09-12 14:41       ` Jeffrey Hugo
2018-09-12 15:27         ` Sudeep Holla
2018-09-12 15:38           ` Sudeep Holla
2018-09-12 15:57             ` Jeffrey Hugo
2018-09-12 16:15               ` Sudeep Holla
2018-09-12 16:25                 ` Jeffrey Hugo
2018-09-12 15:39         ` Jeremy Linton
2018-09-12 16:06           ` Jeffrey Hugo
2018-09-12 10:49     ` Sudeep Holla
2018-09-12 14:48       ` Jeffrey Hugo
2018-09-12 15:32         ` Sudeep Holla
2018-09-13  5:51       ` Brice Goglin
2018-09-13  9:39         ` James Morse
2018-09-13 10:35           ` Sudeep Holla
2018-09-13 11:53             ` Brice Goglin
2018-09-13 15:10               ` Jeffrey Hugo
2018-09-13 15:16               ` Sudeep Holla
2018-09-12 10:37   ` Sudeep Holla

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).