Linux-ARM-Kernel Archive on lore.kernel.org
* About add an A64FX cache control function into resctrl
@ 2021-04-09  5:46 tan.shaopeng
  2021-04-21  8:37 ` tan.shaopeng
  0 siblings, 1 reply; 8+ messages in thread
From: tan.shaopeng @ 2021-04-09  5:46 UTC
  To: 'fenghua.yu@intel.com', 'reinette.chatre@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hello


I'm Tan Shaopeng from Fujitsu Limited. 

I'm trying to implement support for the cache-related features of the
Fujitsu A64FX. The feature in question is a cache partitioning function
that we call the sector cache function. It uses the tag in the upper
8 bits of the 64-bit address, together with the values of the sector
cache registers, to control the virtual cache capacity of the L1D and
L2 caches.

A few days ago, when I sent a driver that implements this function to
the ARM64 kernel community, Will Deacon and Arnd Bergmann suggested
adding the A64FX sector cache function to resctrl.
https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5OcZ=As2kM0D_sbfFcGQ_J2Q+Q@mail.gmail.com/

Based on my study, I think the A64FX sector cache function can be
added to the allocation features of resctrl once James' resctrl
rework has finished. However, implementing it requires additional
resctrl interfaces. The details are as follows. Could you give me
some advice?

[Sector cache function]
The sector cache function splits a cache into multiple sectors and
controls them separately. It is implemented on the L1D cache and the
L2 cache of the A64FX processor and can be controlled individually
for each. The A64FX has no L3 cache. Each L1D cache and each L2 cache
has 4 sectors. For L1D, the sector to use is selected by bits [57:56]
of the address, and the number of ways in each sector is set by the
register IMP_SCCR_L1_EL0.
For L2, the sector to use is selected by bit [56] of the address, and
the number of ways in each sector is set by the registers
IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1, and IMP_SCCR_SET1_L2_EL1.
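
To make the address-based sector selection concrete, here is a minimal
Python sketch. The bit positions are the ones given in the description
above; the helper names are hypothetical, and the way counts programmed
through the IMP_SCCR_* registers are not modelled. How the single L2 bit
maps onto the four L2 sectors depends on those registers, so the sketch
only extracts the bit.

```python
# Sketch only: bit positions come from the description above,
# all function and constant names are hypothetical.

L1_SECTOR_SHIFT = 56   # L1D sector ID lives in address bits [57:56]
L1_SECTOR_MASK = 0x3   # two bits, so 4 possible sector IDs
L2_SECTOR_SHIFT = 56   # L2 selection uses address bit [56]
L2_SECTOR_MASK = 0x1   # a single bit


def tag_address(addr: int, sector: int) -> int:
    """Return addr with bits [57:56] replaced by the given L1D sector ID."""
    if not 0 <= sector <= L1_SECTOR_MASK:
        raise ValueError("sector must be in 0..3")
    cleared = addr & ~(L1_SECTOR_MASK << L1_SECTOR_SHIFT)
    return cleared | (sector << L1_SECTOR_SHIFT)


def l1_sector(addr: int) -> int:
    """Extract the L1D sector ID from bits [57:56] of a tagged address."""
    return (addr >> L1_SECTOR_SHIFT) & L1_SECTOR_MASK


def l2_sector_bit(addr: int) -> int:
    """Extract bit [56], which selects the L2 sector setting."""
    return (addr >> L2_SECTOR_SHIFT) & L2_SECTOR_MASK
```

Because the sector ID travels in the address tag, the same memory can be
accessed under different sector settings simply by tagging the pointer
differently; the low 56 bits of the address are unchanged.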

For more details on the sector cache function, see the A64FX HPC
extension specification (section 1.2, Sector cache) at
https://github.com/fujitsu/A64FX

[Difference between resctrl (CAT) and the sector cache function]
L2/L3 CAT (Cache Allocation Technology) lets the user specify a
physical partition of the cache space that an application can fill.
The A64FX's L1D and L2 caches each have 4 sectors and 16 ways, and the
sector cache function lets the user specify the number of ways each
sector uses. For CAT it is therefore enough to specify one cache
portion per cache_id (socket). The sector cache, on the other hand,
needs a cache portion specified for every sector of each cache_id, so
the following extension to the resctrl interface is needed to support it.

[Idea for the A64FX sector cache control interface (schemata file details)]
L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…

・L1: Add a new interface to control the L1D cache.
・<cwbm>,<cwbm>,<cwbm>,<cwbm>: Specify the number of ways for each sector.
・cwbm: Specify the number of ways in each sector as a bitmap (percentage);
  the bitmap does not indicate the location in the cache.
* In the sector cache function, the L2 sector cache way setting register
  is shared among the PEs (Processor Elements) in a shared domain. If two
  PEs that share an L2 cache belong to different resource groups, one
  resource group's L2 setting will affect the other resource group's
  L2 setting.
* Since the A64FX does not support MPAM, it is not necessary to consider
  how to switch between MPAM and the sector cache function for now.
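
The proposed schemata syntax can be sketched as a small Python parser.
This is only an illustration of the proposal, not kernel code: the
function names are hypothetical, and the hexadecimal encoding of each
<cwbm> is an assumption borrowed from how existing resctrl cache
bitmasks are written. Since the bitmap does not indicate a location,
only the number of set bits is treated as meaningful.

```python
# Sketch only: parses the *proposed* per-sector schemata format.
# Hex encoding of each bitmap is an assumption; names are hypothetical.

SECTORS_PER_CACHE = 4   # from the proposal: 4 sectors per L1D/L2 cache
WAYS_PER_CACHE = 16     # from the proposal: 16 ways


def parse_sector_schemata(line: str) -> dict:
    """Parse e.g. 'L1:0=f,3,1,1;1=ff,1,1,1' into resource name and domains."""
    resource, sep, body = line.strip().partition(":")
    if not sep or not body:
        raise ValueError(f"malformed schemata line: {line!r}")
    domains = {}
    for entry in body.split(";"):
        cache_id, sep, masks = entry.partition("=")
        if not sep:
            raise ValueError(f"malformed domain entry: {entry!r}")
        bitmaps = [int(m, 16) for m in masks.split(",")]
        if len(bitmaps) != SECTORS_PER_CACHE:
            raise ValueError(f"expected {SECTORS_PER_CACHE} bitmaps: {entry!r}")
        if any(b < 0 or b >= (1 << WAYS_PER_CACHE) for b in bitmaps):
            raise ValueError(f"bitmap wider than {WAYS_PER_CACHE} ways: {entry!r}")
        domains[int(cache_id)] = bitmaps
    return {"resource": resource, "domains": domains}


def ways(bitmap: int) -> int:
    """Number of ways requested: only the count of set bits matters."""
    return bin(bitmap).count("1")
```

For example, "L1:0=f,3,1,1" would request 4, 2, 1 and 1 ways for the
four sectors of cache_id 0.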

Some questions:
1. I'm still studying RDT. Could you tell me whether RDT has a
   mechanism similar to the sector cache function?
2. In RDT, the L3 cache is shared among the cores in a socket. If two
   cores that share an L3 cache belong to different resource groups,
   will one resource group's L3 setting affect the other resource
   group's L3 setting?
3. Is this approach acceptable? Could you give me some advice?


Best regards 
Tan Shaopeng 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* RE: About add an A64FX cache control function into resctrl
  2021-04-09  5:46 About add an A64FX cache control function into resctrl tan.shaopeng
@ 2021-04-21  8:37 ` tan.shaopeng
  2021-04-21 16:39   ` Reinette Chatre
  0 siblings, 1 reply; 8+ messages in thread
From: tan.shaopeng @ 2021-04-21  8:37 UTC
  To: 'fenghua.yu@intel.com', 'reinette.chatre@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi,

Ping... any comments or advice on adding an A64FX cache control function to resctrl?

Best regards
Tan Shaopeng




* Re: About add an A64FX cache control function into resctrl
  2021-04-21  8:37 ` tan.shaopeng
@ 2021-04-21 16:39   ` Reinette Chatre
  2021-04-23  8:10     ` tan.shaopeng
  2021-04-28  8:16     ` tan.shaopeng
  0 siblings, 2 replies; 8+ messages in thread
From: Reinette Chatre @ 2021-04-21 16:39 UTC
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi Tan Shaopeng,

On 4/21/2021 1:37 AM, tan.shaopeng@fujitsu.com wrote:
> Hi,
> 
> Ping... any comments or advice on adding an A64FX cache control function to resctrl?

My apologies for the delay.

> 
> Best regards
> Tan Shaopeng
> 
>> Hello
>>
>>
>> I'm Tan Shaopeng from Fujitsu Limited.
>>
>> I'm trying to implement support for the cache-related features of the
>> Fujitsu A64FX. The feature in question is a cache partitioning function
>> that we call the sector cache function. It uses the tag in the upper
>> 8 bits of the 64-bit address, together with the values of the sector
>> cache registers, to control the virtual cache capacity of the L1D and
>> L2 caches.
>>
>> A few days ago, when I sent a driver that implements this function to
>> the ARM64 kernel community, Will Deacon and Arnd Bergmann suggested
>> adding the A64FX sector cache function to resctrl.
>> https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5OcZ=As2kM0D_sbfFcGQ_J2Q+Q@mail.gmail.com/
>>
>> Based on my study, I think the A64FX sector cache function can be
>> added to the allocation features of resctrl once James' resctrl
>> rework has finished. However, implementing it requires additional
>> resctrl interfaces. The details are as follows. Could you give me
>> some advice?
>>
>> [Sector cache function]
>> The sector cache function splits a cache into multiple sectors and
>> controls them separately. It is implemented on the L1D cache and the
>> L2 cache of the A64FX processor and can be controlled individually
>> for each. The A64FX has no L3 cache. Each L1D cache and each L2 cache
>> has 4 sectors. For L1D, the sector to use is selected by bits [57:56]
>> of the address, and the number of ways in each sector is set by the
>> register IMP_SCCR_L1_EL0.
>> For L2, the sector to use is selected by bit [56] of the address, and
>> the number of ways in each sector is set by the registers
>> IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1, and IMP_SCCR_SET1_L2_EL1.
>>
>> For more details on the sector cache function, see the A64FX HPC
>> extension specification (section 1.2, Sector cache) at
>> https://github.com/fujitsu/A64FX

The overview in section 12 was informative but very high level.
I was not able to find any instance of "IMP_SCCR" in this document to
explore how this cache allocation works.

Are these cache sectors exposed to the OS in any way? For example, when
the OS discovers the cache, does it learn about these sectors and expose
the details to user space (/sys/devices/system/cpu/cpuX/cache)?

The overview of Sector Cache in that document provides details of how
the size of the sector itself is dynamically adjusted to usage. That
description is quite cryptic, but it seems that a sector, since the
number of ways associated with it can change dynamically, is closer to
a class of service or resource group in the resctrl environment.

I may well be interpreting things wrong here. Could you perhaps point
me to where I can obtain more details?


>> [Difference between resctrl (CAT) and the sector cache function]
>> L2/L3 CAT (Cache Allocation Technology) lets the user specify a
>> physical partition of the cache space that an application can fill.
>> The A64FX's L1D and L2 caches each have 4 sectors and 16 ways, and the
>> sector cache function lets the user specify the number of ways each
>> sector uses. For CAT it is therefore enough to specify one cache
>> portion per cache_id (socket). The sector cache, on the other hand,
>> needs a cache portion specified for every sector of each cache_id, so
>> the following extension to the resctrl interface is needed to support it.
>>
>> [Idea for the A64FX sector cache control interface (schemata file details)]
>> L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>> L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>>
>> ・L1: Add a new interface to control the L1D cache.
>> ・<cwbm>,<cwbm>,<cwbm>,<cwbm>: Specify the number of ways for each sector.
>> ・cwbm: Specify the number of ways in each sector as a bitmap (percentage);
>>    the bitmap does not indicate the location in the cache.
>> * In the sector cache function, the L2 sector cache way setting register
>>    is shared among the PEs (Processor Elements) in a shared domain. If two
>>    PEs that share an L2 cache belong to different resource groups, one
>>    resource group's L2 setting will affect the other resource group's
>>    L2 setting.

In resctrl a "resource group" can be viewed as a class of service.

>> * Since the A64FX does not support MPAM, it is not necessary to consider
>>    how to switch between MPAM and the sector cache function for now.
>>
>> Some questions:
>> 1. I'm still studying RDT. Could you tell me whether RDT has a
>>    mechanism similar to the sector cache function?

This is not clear to me yet. One thing to keep in mind is that a bit in  
the capacity bitmask could correspond to some number of ways in a cache,  
but it does not have to. It is essentially a hint to hardware on how  
much cache space needs to be allocated while also indicating overlap and  
isolation from other allocations.

resctrl already supports the bitmask being interpreted differently  
between architectures and with the MPAM support there will be even more  
support for different interpretations.
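
The overlap and isolation aspect of a capacity bitmask can be shown
with a short Python sketch. It is an illustration only: the helper
names are hypothetical, and treating each bit as an equal share of the
cache is a simplifying assumption, since a bit is a hint to hardware
and need not map to a fixed number of ways.

```python
# Sketch only: hypothetical helpers that compare two capacity bitmasks.

def shared_portion(cbm_a: int, cbm_b: int) -> int:
    """Bits set in both masks mark cache capacity the two groups share."""
    return cbm_a & cbm_b


def is_isolated(cbm_a: int, cbm_b: int) -> bool:
    """Two allocations are isolated when their masks have no common bit."""
    return shared_portion(cbm_a, cbm_b) == 0


def capacity_share(cbm: int, cbm_len: int) -> float:
    """Fraction of the cache hinted at, assuming each bit is an equal share."""
    return bin(cbm).count("1") / cbm_len
```

With an 8-bit mask, 0xf0 and 0x0f are isolated from each other, while
0xf0 and 0x3c overlap in the portion 0x30.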

>> 2. In RDT, the L3 cache is shared among the cores in a socket. If two
>>    cores that share an L3 cache belong to different resource groups,
>>    will one resource group's L3 setting affect the other resource
>>    group's L3 setting?

This question is not entirely clear to me. Are you referring to the  
hardware layout or configuration changes via the resctrl "cpus" file?

Each resource group is a class of service (CLOS) that is supported by
all cache instances. By default, each resource group thus contains
all cache instances on the system (even if some cache instances do not
support the same number of CLOS, resctrl only supports the CLOS that
are supported by all resources).
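
The rule in the parenthesis amounts to taking a minimum across
resources. A tiny Python sketch, with a hypothetical helper name and
made-up CLOS counts used purely for illustration:

```python
# Sketch only: the per-resource CLOS counts below are invented examples.

def usable_clos(num_clos_per_resource: dict) -> int:
    """Only CLOS IDs supported by every resource can back a resource group."""
    return min(num_clos_per_resource.values())


example = {"L3": 16, "L2": 8, "MB": 8}
print(usable_clos(example))
```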

Reinette


* RE: About add an A64FX cache control function into resctrl
  2021-04-21 16:39   ` Reinette Chatre
@ 2021-04-23  8:10     ` tan.shaopeng
  2021-04-28  8:16     ` tan.shaopeng
  1 sibling, 0 replies; 8+ messages in thread
From: tan.shaopeng @ 2021-04-23  8:10 UTC
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi Reinette,


Thanks for your comments.

I am sorry that my description of the sector cache function was
difficult to understand. All the public specifications are available
at the URL I gave, so please give me some time to organize the
details of the A64FX cache control function.

Best regards, 
Tan Shaopeng


* RE: About add an A64FX cache control function into resctrl
  2021-04-21 16:39   ` Reinette Chatre
  2021-04-23  8:10     ` tan.shaopeng
@ 2021-04-28  8:16     ` tan.shaopeng
  2021-04-29 17:42       ` Reinette Chatre
  1 sibling, 1 reply; 8+ messages in thread
From: tan.shaopeng @ 2021-04-28  8:16 UTC
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi Reinette,

> The overview in section 12 was informative but very high level.

While considering how to answer the questions in your earlier email,
I reread it and realized that the information I provided was
insufficient. I am sorry about that.

To understand the A64FX sector cache function, could you please see
A64FX_Microarchitecture_Manual - section 12. Sector Cache
https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.4.pdf
and
A64FX_Specification_HPC_Extension - section 1.2. Sector Cache
https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf

In addition, Japan will be on a long holiday of about one week from
April 29th. I will answer your other questions after the holidays.


Best regards 
Tan Shaopeng


* Re: About add an A64FX cache control function into resctrl
  2021-04-28  8:16     ` tan.shaopeng
@ 2021-04-29 17:42       ` Reinette Chatre
  2021-04-29 17:50         ` Luck, Tony
  0 siblings, 1 reply; 8+ messages in thread
From: Reinette Chatre @ 2021-04-29 17:42 UTC
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Tan Shaopeng,

On 4/28/2021 1:16 AM, tan.shaopeng@fujitsu.com wrote:
> Hi Reinette,
> 
>> On 4/21/2021 1:37 AM, tan.shaopeng@fujitsu.com wrote:
>>> Hi,
>>>
>>> Ping... any comments&advice about add an A64FX cache control function
>> into resctrl?
>>
>> My apologies for the delay.
>>
>>>
>>> Best regards
>>> Tan Shaopeng
>>>
>>>> Hello
>>>>
>>>>
>>>> I'm Tan Shaopeng from Fujitsu Limited.
>>>>
>>>> I’m trying to implement Fujitsu A64FX’s cache related features.
>>>> It is a cache partitioning function we called sector cache function
>>>> that using the value of the tag that is upper 8 bits of the 64bit
>>>> address and the value of the sector cache register to control virtual cache
>> capacity of the L1D&L2 cache.
>>>>
>>>> A few days ago, when I sent a driver that realizes this function to
>>>> ARM64 kernel community, Will Deacon and Arnd Bergmann suggested an
>>>> idea to add the sector cache function of A64FX into resctrl.
>>>>
>>>> https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5OcZ=As2kM0D_sbfFcGQ_J2Q+Q@mail.gmail.com/
>>>>
>>>> Based on my study, I think the sector cache function of A64FX can be
>>>> added into the allocation features of resctrl after James' resctrl
>>>> rework has finished. But, in order to implement this function, more
>>>> interfaces for resctrl are need. The details are as follow, and could
>>>> you give me some advice?
>>>>
>>>> [Sector cache function]
>>>> The sector cache function split cache into multiple sectors and
>>>> control them separately. It is implemented on the L1D cache and
>>>> L2 cache in the A64FX processor and can be controlled individually
>>>> for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache and
>>>> L2 cache has 4 sectors. Which L1D sector is used is specified by the
>>>> value of [57:56] bits of address, how many ways of sector are
>>>> specified by the value of register (IMP_SCCR_L1_EL0).
>>>> Which L2 sector is used is specified by the value of [56] bits of
>>>> address, and how many ways of sector are specified by value of
>>>> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
>>>> IMP_SCCR_SET1_L2_EL1).
>>>>
>>>> For more details of sector cache function, see A64FX HPC extension
>>>> specification (1.2. Sector cache) in https://github.com/fujitsu/A64FX
>>
>> The overview in section 12 was informative but very high level.
> 
> I'm considering how to answer your questions from your email which
> I received before, when I check the email again, I am sorry that
> the information I provided before are insufficient.
> 
> To understand the sector cache function of A64FX, could you please see
> A64FX_Microarchitecture_Manual - section 12. Sector Cache
> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.4.pdf
> and,
> A64FX_Specification_HPC_Extension - section 1.2. Sector Cache
> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf

Thank you for the direct links - I missed that there are two documents  
available.

After reading the spec portion, it seems even more to me that "sectors"
could be considered the same as the resctrl "classes of service". The
Fujitsu hardware supports four sectors that can each be configured with a
different number of ways using the registers you mention above. In resctrl
this could be considered hardware that supports four classes of service,
where each class of service can be allocated a different number of ways.
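
To make that mapping concrete, here is a minimal sketch in C, assuming only
the figures quoted in this thread (4 sectors and 16 ways per cache). The
struct and function names are hypothetical, and nothing here reflects the
real layout of the IMP_SCCR_* registers; it only models "one sector equals
one class of service with its own way count".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SECTORS 4	/* sectors per cache, as described in this thread */
#define NR_WAYS    16	/* ways per L1D/L2 cache, as described in this thread */

/* Hypothetical model: each sector is treated as one class of service. */
struct sector_cfg {
	unsigned int max_ways[NR_SECTORS];	/* ways granted to each sector */
};

/* Return true if no sector asks for more ways than the cache has. */
static bool sector_cfg_valid(const struct sector_cfg *cfg)
{
	assert(cfg != NULL);
	for (size_t i = 0; i < NR_SECTORS; i++)
		if (cfg->max_ways[i] > NR_WAYS)
			return false;
	return true;
}

/* Set the way count of one sector; returns 0 on success, -1 on bad input. */
static int sector_set_ways(struct sector_cfg *cfg, unsigned int sector,
			   unsigned int ways)
{
	assert(cfg != NULL);
	if (sector >= NR_SECTORS || ways > NR_WAYS)
		return -1;
	cfg->max_ways[sector] = ways;
	return 0;
}
```

In this model a schemata-style update for one cache instance would reduce
to four calls to sector_set_ways(), one per class of service.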

The other part is how the hardware knows which sector is being used at any
moment in time. In resctrl that is programmed by writing the active class
of service into the needed register at the time the application is context
switched (resctrl_sched_in()). This seems different here since, as you
describe, the sector is chosen by bits in the address. Even so, which bits
to set in the address also needs to be programmed, and I also understand
that there is a "default" sector that can be programmed via a register.
Could these be equivalent to what is done currently in resctrl?
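
As a rough illustration of the address-based selection, the sketch below
tags a 64-bit address with a sector ID. It assumes only the bit positions
quoted in this thread ([57:56] for L1D, [56] for L2); the helper names are
hypothetical, and any enable bits or default-sector behaviour described in
the A64FX manuals are not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT     56
#define L1D_SECTOR_MASK  (UINT64_C(0x3) << SECTOR_SHIFT)	/* bits [57:56] */
#define L2_SECTOR_MASK   (UINT64_C(0x1) << SECTOR_SHIFT)	/* bit  [56]    */

/* Replace bits [57:56] of addr with the requested L1D sector (0..3). */
static uint64_t addr_set_l1d_sector(uint64_t addr, unsigned int sector)
{
	assert(sector < 4);
	return (addr & ~L1D_SECTOR_MASK) | ((uint64_t)sector << SECTOR_SHIFT);
}

/* Extract the L1D sector ID carried in bits [57:56]. */
static unsigned int addr_l1d_sector(uint64_t addr)
{
	return (unsigned int)((addr & L1D_SECTOR_MASK) >> SECTOR_SHIFT);
}

/* Extract the L2 sector ID carried in bit [56]. */
static unsigned int addr_l2_sector(uint64_t addr)
{
	return (unsigned int)((addr & L2_SECTOR_MASK) >> SECTOR_SHIFT);
}
```

The contrast with resctrl_sched_in() is that here the selection travels
with each memory access rather than with the task that is currently
scheduled in.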

(Could you please also consider my original questions?)

> 
> In addition, Japan will be on a long holiday about one week from
> April 29th, I will answer your other questions after the holidays.
> 
>> I was not able to find any instance of "IMP_SCCR" in this document to explore
>> how this cache allocation works.
>>
>> Are these cache sectors exposed to the OS in any way? For example, when the
>> OS discovers the cache, does it learn about these sectors and expose the
>> details to user space (/sys/devices/system/cpuX/cache)?
>>
>> The overview of Sector Cache in that document provides details of how the size
>> of the sector itself is dynamically adjusted to usage. That description is quite
>> cryptic but it seems like a sector, since the number of ways associated with it
>> can dynamically change, is more equivalent to a class of service or resource
>> group in the resctrl environment.
>>
>> I really may be interpreting things wrong here, could you perhaps point me to
>> where I can obtain more details?
>>
>>
>>>> [Difference between resctrl(CAT) and this sector cache function]
>>>> L2/L3 CAT (Cache Allocation Technology) enables the user to specify
>>>> some physical partition of cache space that an application can fill.
>>>> A64FX's L1D/L2 cache has 4 sectors and 16ways. This sector function
>>>> enables a user to specify number of ways each sector uses.
>>>> Therefore, for CAT it is enough to specify a cache portion for each
>>>> cache_id (socket). On the other hand, sector cache needs to specify
>>>> cache portion of each sector for each cache_id, and following
>>>> extension to resctrl interface is needed to support sector cache.
>>>>
>>>> [Idea for A64FX sector cache function control interface (schemata
>>>> file details)]
>>>>
>>>> L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>>>>
>>>> L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>>>>
>>>> ・L1: Add a new interface to control the L1D cache.
>>>> ・<cwbm>,<cwbm>,<cwbm>,<cwbm>: Specify the number of ways for each
>>>>     sector.
>>>> ・cwbm: Specify the number of ways in each sector as a bitmap
>>>>     (percentage), but the bitmap does not indicate the location of
>>>>     the cache.
>>>> * In the sector cache function, L2 sector cache way setting register is
>>>>     shared among PEs (Processor Element) in shared domain. If two PEs
>>>>     which share L2 cache belongs to different resource groups, one
>>>>     resource group's L2 setting will affect to other resource group's
>>>>     L2 setting.
>>
>> In resctrl a "resource group" can be viewed as a class of service.
>>
>>>> * Since A64FX does not support MPAM, it is not necessary to consider
>>>>     how to switch between MPAM and sector cache function now.
>>>>
>>>> Some questions:
>>>> 1.I'm still studying about RDT, could you tell me whether RDT has
>>>>     the similar mechanism with sector cache function?
>>
>> This is not clear to me yet. One thing to keep in mind is that a bit in the capacity
>> bitmask could correspond to some number of ways in a cache, but it does not
>> have to. It is essentially a hint to hardware on how much cache space needs to
>> be allocated while also indicating overlap and isolation from other allocations.
>>
>> resctrl already supports the bitmask being interpreted differently between
>> architectures and with the MPAM support there will be even more support for
>> different interpretations.
>>
>>>> 2.In RDT, L3 cache is shared among cores in socket. If two cores which
>>>>     share L3 cache belongs to different resource groups, one resource
>>>>     group's L3 setting will affect to other resource group's L3 setting?
>>
>> This question is not entirely clear to me. Are you referring to the hardware layout
>> or configuration changes via the resctrl "cpus" file?
>>
>> Each resource group is a class of service (CLOS) that is supported by all cache
>> instances. By default each resource group would thus contain all cache
>> instances on the system (even if some cache instances do not support the
>> same number of CLOS resctrl would only support the CLOS supported by all
>> resources).
> 

Reinette


* RE: About add an A64FX cache control function into resctrl
  2021-04-29 17:42       ` Reinette Chatre
@ 2021-04-29 17:50         ` Luck, Tony
  2021-04-30 11:46           ` Catalin Marinas
  0 siblings, 1 reply; 8+ messages in thread
From: Luck, Tony @ 2021-04-29 17:50 UTC (permalink / raw)
  To: Chatre, Reinette, tan.shaopeng, Yu, Fenghua
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

>>>> [Sector cache function]
>>>> The sector cache function split cache into multiple sectors and
>>>> control them separately. It is implemented on the L1D cache and
>>>> L2 cache in the A64FX processor and can be controlled individually
>>>> for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache and
>>>> L2 cache has 4 sectors. Which L1D sector is used is specified by the
>>>> value of [57:56] bits of address, how many ways of sector are
>>>> specified by the value of register (IMP_SCCR_L1_EL0).
>>>> Which L2 sector is used is specified by the value of [56] bits of
>>>> address, and how many ways of sector are specified by value of
>>>> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
>>>> IMP_SCCR_SET1_L2_EL1).

Are A64FX binaries position independent?  I.e. could the OS reassign
a running task to a different sector by remapping it to different virtual
addresses during a context switch?

Or is this a static property at task launch?

-Tony


* Re: About add an A64FX cache control function into resctrl
  2021-04-29 17:50         ` Luck, Tony
@ 2021-04-30 11:46           ` Catalin Marinas
  0 siblings, 0 replies; 8+ messages in thread
From: Catalin Marinas @ 2021-04-30 11:46 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Chatre, Reinette, tan.shaopeng, Yu, Fenghua,
	'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

On Thu, Apr 29, 2021 at 05:50:20PM +0000, Luck, Tony wrote:
> >>>> [Sector cache function]
> >>>> The sector cache function split cache into multiple sectors and
> >>>> control them separately. It is implemented on the L1D cache and
> >>>> L2 cache in the A64FX processor and can be controlled individually
> >>>> for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache and
> >>>> L2 cache has 4 sectors. Which L1D sector is used is specified by the
> >>>> value of [57:56] bits of address, how many ways of sector are
> >>>> specified by the value of register (IMP_SCCR_L1_EL0).
> >>>> Which L2 sector is used is specified by the value of [56] bits of
> >>>> address, and how many ways of sector are specified by value of
> >>>> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
> >>>> IMP_SCCR_SET1_L2_EL1).
> 
> Are A64FX binaries position independent?  I.e. could the OS reassign
> a running task to a different sector by remapping it to different virtual
> addresses during a context switch?

Arm64 supports a maximum of 52 bits of virtual or physical address. The
largest output address the MMU can produce is therefore 52 bits. I
presume bits 56 and 57 of the address bus are used for some cache affinity
(sector selection), but they don't influence the memory addressing, nor
could the MMU set them.
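
The point can be checked with a few lines of C. This is a simplified
sketch, assuming a flat 52-bit address field and the sector bit positions
quoted earlier in the thread; the names are hypothetical and the handling
of the remaining upper address bits is deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADDR_BITS         52
#define ADDR_MASK         ((UINT64_C(1) << ADDR_BITS) - 1)	/* bits [51:0]  */
#define SECTOR_BITS_MASK  (UINT64_C(0x3) << 56)		/* bits [57:56] */

/* Keep only the bits that can take part in addressing memory. */
static uint64_t addr_low_bits(uint64_t addr)
{
	return addr & ADDR_MASK;
}

/* Two addresses name the same location if their low 52 bits agree. */
static bool same_location(uint64_t a, uint64_t b)
{
	return addr_low_bits(a) == addr_low_bits(b);
}
```

Because SECTOR_BITS_MASK and ADDR_MASK do not overlap, changing the sector
bits never changes which location is accessed in this model.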

-- 
Catalin


Thread overview: 8+ messages
2021-04-09  5:46 About add an A64FX cache control function into resctrl tan.shaopeng
2021-04-21  8:37 ` tan.shaopeng
2021-04-21 16:39   ` Reinette Chatre
2021-04-23  8:10     ` tan.shaopeng
2021-04-28  8:16     ` tan.shaopeng
2021-04-29 17:42       ` Reinette Chatre
2021-04-29 17:50         ` Luck, Tony
2021-04-30 11:46           ` Catalin Marinas
