linux-arm-kernel.lists.infradead.org archive mirror
* About add an A64FX cache control function into resctrl
@ 2021-04-09  5:46 tan.shaopeng
  2021-04-21  8:37 ` tan.shaopeng
  0 siblings, 1 reply; 20+ messages in thread
From: tan.shaopeng @ 2021-04-09  5:46 UTC (permalink / raw)
  To: 'fenghua.yu@intel.com', 'reinette.chatre@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hello,

I'm Tan Shaopeng from Fujitsu Limited.

I'm trying to implement support for the cache-related features of the
Fujitsu A64FX. The feature is a cache partitioning function that we call
the sector cache function. It uses the tag in the upper 8 bits of the
64-bit address, together with the values of the sector cache registers,
to control the virtual capacity of the L1D and L2 caches.

A few days ago, when I sent a driver that implements this function to
the ARM64 kernel community, Will Deacon and Arnd Bergmann suggested
adding the A64FX sector cache function to resctrl.
https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5OcZ=As2kM0D_sbfFcGQ_J2Q+Q@mail.gmail.com/

Based on my study, I think the A64FX sector cache function can be
added to the allocation features of resctrl once James' resctrl
rework has finished. However, implementing it requires additional
resctrl interfaces. The details follow. Could you give me some advice?

[Sector cache function]
The sector cache function splits a cache into multiple sectors and
controls them separately. It is implemented on the L1D cache and the
L2 cache of the A64FX processor, and the two caches can be controlled
independently. A64FX has no L3 cache. Each L1D cache and each L2 cache
has 4 sectors. The L1D sector to use is selected by bits [57:56] of
the address, and the number of ways for each sector is set in the
register IMP_SCCR_L1_EL0.
The L2 sector to use is selected by bit [56] of the address, and the
number of ways for each sector is set in the registers
IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1 and IMP_SCCR_SET1_L2_EL1.
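
As an illustration of the sector selection described above, here is a
minimal Python sketch that reads and writes the sector bits of an
address. The function names are hypothetical; only the bit positions
([57:56] for L1D, [56] for L2) come from the description. How the single
L2 bit maps onto the four L2 sectors through the registers is not
modelled here.

```python
# Sketch only: helper names are made up for illustration.
# Bit positions follow the text: L1D uses address bits [57:56],
# L2 uses address bit [56].

def l1d_sector(addr: int) -> int:
    """Return the L1D sector ID encoded in address bits [57:56]."""
    return (addr >> 56) & 0b11


def l2_sector_select(addr: int) -> int:
    """Return the L2 sector selector encoded in address bit [56]."""
    return (addr >> 56) & 0b1


def tag_address(addr: int, sector: int) -> int:
    """Write a 2-bit sector ID into bits [57:56]; other bits are unchanged."""
    if not 0 <= sector <= 3:
        raise ValueError("sector must be in 0..3")
    cleared = addr & ~(0b11 << 56)
    return cleared | (sector << 56)
```

Under these assumptions, tagging an address with sector 3 selects L1D
sector 3 and sets the L2 selector bit to 1, while sector 2 selects L1D
sector 2 and leaves the L2 selector bit at 0.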

For more details of the sector cache function,
see the A64FX HPC extension specification (1.2. Sector cache) at
https://github.com/fujitsu/A64FX

[Difference between resctrl (CAT) and the sector cache function]
L2/L3 CAT (Cache Allocation Technology) lets the user specify a
physical partition of the cache space that an application can fill.
The A64FX L1D/L2 cache has 4 sectors and 16 ways. The sector cache
function lets the user specify the number of ways that each sector uses.
For CAT it is therefore enough to specify one cache portion for
each cache_id (socket). The sector cache, in contrast, needs a
cache portion for each sector of each cache_id, so the following
extension to the resctrl interface is needed to support it.

[Idea for the A64FX sector cache control interface (schemata file details)]
L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…

・L1: Add a new interface to control the L1D cache.
・<cwbm>,<cwbm>,<cwbm>,<cwbm>: Specify the number of ways for each sector.
・cwbm: Specify the number of ways in each sector as a bitmap (percentage);
  the bitmap does not indicate the location in the cache.
* In the sector cache function, the L2 sector cache way setting register
  is shared among the PEs (Processor Elements) in a shared domain. If two
  PEs that share an L2 cache belong to different resource groups, one
  resource group's L2 setting will affect the other resource group's
  L2 setting.
* Since A64FX does not support MPAM, it is not necessary to consider
  how to switch between MPAM and the sector cache function for now.
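
To make the proposed schemata format concrete, here is a minimal Python
sketch of how one such line could be parsed. It rests on two assumptions
that the proposal does not fix: each <cwbm> is written in hexadecimal,
as existing resctrl bitmasks are, and the number of set bits in a <cwbm>
is the number of ways for that sector. The function name is
hypothetical.

```python
# Sketch only: parses one line of the *proposed* format
#   L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=...
# Assumptions: cwbm is hexadecimal, and its popcount is the way count.

def parse_sector_schemata(line: str, sectors: int = 4):
    """Return (resource, {cache_id: [ways for each sector]})."""
    resource, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("missing ':' after resource name")
    domains = {}
    for entry in rest.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        cache_id, sep, masks = entry.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {entry!r}")
        cwbms = [int(mask, 16) for mask in masks.split(",")]
        if len(cwbms) != sectors:
            raise ValueError(f"expected {sectors} bitmaps, got {len(cwbms)}")
        # The bitmap carries a count, not a location, so keep only the popcount.
        domains[int(cache_id)] = [bin(mask).count("1") for mask in cwbms]
    return resource.strip(), domains
```

With these assumptions, the line "L2:0=f,3,3,1" would assign 4, 2, 2 and
1 ways to sectors 0 to 3 of cache_id 0.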

Some questions:
1. I'm still studying RDT. Could you tell me whether RDT has a
   mechanism similar to the sector cache function?
2. In RDT, the L3 cache is shared among the cores in a socket. If two
   cores that share an L3 cache belong to different resource groups,
   will one resource group's L3 setting affect the other resource
   group's L3 setting?
3. Is this approach acceptable? Could you give me some advice?


Best regards 
Tan Shaopeng 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* RE: About add an A64FX cache control function into resctrl
  2021-04-09  5:46 About add an A64FX cache control function into resctrl tan.shaopeng
@ 2021-04-21  8:37 ` tan.shaopeng
  2021-04-21 16:39   ` Reinette Chatre
  0 siblings, 1 reply; 20+ messages in thread
From: tan.shaopeng @ 2021-04-21  8:37 UTC (permalink / raw)
  To: 'fenghua.yu@intel.com', 'reinette.chatre@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi,

Ping... any comments or advice about adding an A64FX cache control function to resctrl?

Best regards
Tan Shaopeng




* Re: About add an A64FX cache control function into resctrl
  2021-04-21  8:37 ` tan.shaopeng
@ 2021-04-21 16:39   ` Reinette Chatre
  2021-04-23  8:10     ` tan.shaopeng
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Reinette Chatre @ 2021-04-21 16:39 UTC (permalink / raw)
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi Tan Shaopeng,

On 4/21/2021 1:37 AM, tan.shaopeng@fujitsu.com wrote:
> Hi,
> 
> Ping... any comments&advice about add an A64FX cache control function into resctrl?

My apologies for the delay.

>>
>> [Sector cache function]
>> The sector cache function splits a cache into multiple sectors and
>> controls them separately. It is implemented on the L1D cache and the
>> L2 cache of the A64FX processor, and the two caches can be controlled
>> independently. A64FX has no L3 cache. Each L1D cache and each L2 cache
>> has 4 sectors. The L1D sector to use is selected by bits [57:56] of
>> the address, and the number of ways for each sector is set in the
>> register IMP_SCCR_L1_EL0.
>> The L2 sector to use is selected by bit [56] of the address, and the
>> number of ways for each sector is set in the registers
>> IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1 and IMP_SCCR_SET1_L2_EL1.
>>
>> For more details of the sector cache function, see the A64FX HPC extension
>> specification (1.2. Sector cache) at https://github.com/fujitsu/A64FX

The overview in section 12 was informative but very high level.
I was not able to find any instance of "IMP_SCCR" in this document to  
explore how this cache allocation works.

Are these cache sectors exposed to the OS in any way? For example, when
the OS discovers the cache, does it learn about these sectors and expose
the details to user space (/sys/devices/system/cpu/cpuX/cache)?
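
For reference, the cache details that Linux currently exposes to user
space can be listed with a short Python sketch like the one below. The
function name is hypothetical. The attribute names are the standard
cacheinfo files under /sys/devices/system/cpu/cpuX/cache/indexN/, and as
far as I know none of them describes sectors.

```python
from pathlib import Path

# Sketch only: collects the standard cacheinfo attributes for one CPU.
# sysfs_root is a parameter so the function can be pointed at a test tree.

def read_cache_info(cpu: int, sysfs_root: str = "/sys/devices/system/cpu"):
    """Return one dict of attributes per cache index of the given CPU."""
    cache_dir = Path(sysfs_root) / f"cpu{cpu}" / "cache"
    wanted = ("level", "type", "size", "ways_of_associativity",
              "shared_cpu_list")
    caches = []
    for index in sorted(cache_dir.glob("index*")):
        info = {}
        for name in wanted:
            attr = index / name
            # Not every attribute is present on every platform.
            if attr.is_file():
                info[name] = attr.read_text().strip()
        caches.append(info)
    return caches
```

Calling read_cache_info(0) on a running system returns the level, type,
size, associativity and sharing information for each cache of CPU 0, for
whichever of those files the platform provides.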

The overview of Sector Cache in that document provides details of how  
the size of the sector itself is dynamically adjusted to usage. That  
description is quite cryptic but it seems like a sector, since the  
number of ways associated with it can dynamically change, is more  
equivalent to a class of service or resource group in the resctrl  
environment.

I really may be interpreting things wrong here, could you perhaps point  
me to where I can obtain more details?


>> [Difference between resctrl (CAT) and the sector cache function]
>> L2/L3 CAT (Cache Allocation Technology) lets the user specify a
>> physical partition of the cache space that an application can fill.
>> The A64FX L1D/L2 cache has 4 sectors and 16 ways. The sector cache
>> function lets the user specify the number of ways that each sector uses.
>> For CAT it is therefore enough to specify one cache portion for each
>> cache_id (socket). The sector cache, in contrast, needs a cache portion
>> for each sector of each cache_id, so the following extension to the
>> resctrl interface is needed to support it.
>>
>> [Idea for the A64FX sector cache control interface (schemata file details)]
>> L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>> L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>>
>> ・L1: Add a new interface to control the L1D cache.
>> ・<cwbm>,<cwbm>,<cwbm>,<cwbm>: Specify the number of ways for each sector.
>> ・cwbm: Specify the number of ways in each sector as a bitmap (percentage);
>>    the bitmap does not indicate the location in the cache.
>> * In the sector cache function, the L2 sector cache way setting register
>>    is shared among the PEs (Processor Elements) in a shared domain. If two
>>    PEs that share an L2 cache belong to different resource groups, one
>>    resource group's L2 setting will affect the other resource group's
>>    L2 setting.

In resctrl a "resource group" can be viewed as a class of service.

>> * Since A64FX does not support MPAM, it is not necessary to consider
>>    how to switch between MPAM and the sector cache function for now.
>>
>> Some questions:
>> 1. I'm still studying RDT. Could you tell me whether RDT has a
>>    mechanism similar to the sector cache function?

This is not clear to me yet. One thing to keep in mind is that a bit in  
the capacity bitmask could correspond to some number of ways in a cache,  
but it does not have to. It is essentially a hint to hardware on how  
much cache space needs to be allocated while also indicating overlap and  
isolation from other allocations.

resctrl already supports the bitmask being interpreted differently  
between architectures and with the MPAM support there will be even more  
support for different interpretations.
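
To illustrate how capacity bitmasks express overlap and isolation, here
is a minimal Python sketch. The function name and the group names in the
usage note are hypothetical, and the sketch treats the masks purely as
bit sets without assuming that a bit corresponds to a way.

```python
from itertools import combinations

# Sketch only: reports which allocations share bitmask bits and which
# are isolated from every other allocation.

def overlap_report(cbms):
    """cbms maps a group name to its capacity bitmask (an int).

    Returns (overlapping, isolated): overlapping maps a pair of group
    names to the bits they share; isolated lists groups sharing no bits.
    """
    overlapping = {}
    for (g1, m1), (g2, m2) in combinations(sorted(cbms.items()), 2):
        shared = m1 & m2
        if shared:
            overlapping[(g1, g2)] = shared
    isolated = [g for g in sorted(cbms)
                if not any(g in pair for pair in overlapping)]
    return overlapping, isolated
```

For example, with masks 0xf0, 0x3c and 0x03 for groups "a", "b" and "c",
groups "a" and "b" share the bits 0x30, while "c" is isolated from both.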

>> 2. In RDT, the L3 cache is shared among the cores in a socket. If two
>>    cores that share an L3 cache belong to different resource groups,
>>    will one resource group's L3 setting affect the other resource
>>    group's L3 setting?

This question is not entirely clear to me. Are you referring to the  
hardware layout or configuration changes via the resctrl "cpus" file?

Each resource group is a class of service (CLOS) that is supported by
all cache instances. By default each resource group would thus contain
all cache instances on the system (even if some cache instances do not
support the same number of CLOS, resctrl would only support the CLOS
supported by all resources).
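
The rule in the parenthesis above can be sketched in a few lines of
Python. The function name and the resource names in the usage note are
hypothetical; the sketch only shows that the usable number of CLOS is
limited by the resource that supports the fewest.

```python
# Sketch only: the number of CLOS usable system-wide is the smallest
# number supported by any allocation-capable resource.

def usable_clos(num_clos_per_resource):
    """num_clos_per_resource maps a resource name to its CLOS count."""
    if not num_clos_per_resource:
        raise ValueError("at least one resource is required")
    return min(num_clos_per_resource.values())
```

For instance, if one resource supports 16 CLOS and two others support 8
each, only 8 resource groups can be created.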

Reinette


* RE: About add an A64FX cache control function into resctrl
  2021-04-21 16:39   ` Reinette Chatre
@ 2021-04-23  8:10     ` tan.shaopeng
  2021-04-28  8:16     ` tan.shaopeng
  2021-05-17  8:37     ` tan.shaopeng
  2 siblings, 0 replies; 20+ messages in thread
From: tan.shaopeng @ 2021-04-23  8:10 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi Reinette,


Thanks for your comments.

I am sorry that my description of the sector cache function was
difficult to understand. All of the public specifications are available
at the URL I gave earlier, so please give me some time to organize the
details of the A64FX cache control function.

Best regards, 
Tan Shaopeng


* RE: About add an A64FX cache control function into resctrl
  2021-04-21 16:39   ` Reinette Chatre
  2021-04-23  8:10     ` tan.shaopeng
@ 2021-04-28  8:16     ` tan.shaopeng
  2021-04-29 17:42       ` Reinette Chatre
  2021-05-17  8:37     ` tan.shaopeng
  2 siblings, 1 reply; 20+ messages in thread
From: tan.shaopeng @ 2021-04-28  8:16 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi Reinette,

> The overview in section 12 was informative but very high level.

I have been considering how to answer the questions in your earlier
email. On re-reading it, I realize that the information I provided
was insufficient, and I apologize for that.

To understand the A64FX sector cache function, could you please see
A64FX_Microarchitecture_Manual - section 12. Sector Cache
https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.4.pdf
and
A64FX_Specification_HPC_Extension - section 1.2. Sector Cache
https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf

In addition, Japan will be on a holiday of about one week starting
April 29th, so I will answer your other questions after the holiday.


Best regards 
Tan Shaopeng


* Re: About add an A64FX cache control function into resctrl
  2021-04-28  8:16     ` tan.shaopeng
@ 2021-04-29 17:42       ` Reinette Chatre
  2021-04-29 17:50         ` Luck, Tony
  2021-05-17  8:31         ` tan.shaopeng
  0 siblings, 2 replies; 20+ messages in thread
From: Reinette Chatre @ 2021-04-29 17:42 UTC (permalink / raw)
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Tan Shaopeng,

On 4/28/2021 1:16 AM, tan.shaopeng@fujitsu.com wrote:
> Hi Reinette,
> 
>> On 4/21/2021 1:37 AM, tan.shaopeng@fujitsu.com wrote:
>>> Hi,
>>>
>>> Ping... any comments&advice about add an A64FX cache control function
>> into resctrl?
>>
>> My apologies for the delay.
>>
>>>
>>> Best regards
>>> Tan Shaopeng
>>>
>>>> Hello
>>>>
>>>>
>>>> I'm Tan Shaopeng from Fujitsu Limited.
>>>>
>>>> I’m trying to implement Fujitsu A64FX’s cache related features.
>>>> It is a cache partitioning function we called sector cache function
>>>> that using the value of the tag that is upper 8 bits of the 64bit
>>>> address and the value of the sector cache register to control virtual cache
>> capacity of the L1D&L2 cache.
>>>>
>>>> A few days ago, when I sent a driver that realizes this function to
>>>> ARM64 kernel community, Will Deacon and Arnd Bergmann suggested an
>>>> idea to add the sector cache function of A64FX into resctrl.
>>>>
>>>> https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5OcZ=As2kM0D_sbfFcGQ_J2Q+Q@mail.gmail.com/
>>>>
>>>> Based on my study, I think the sector cache function of A64FX can be
>>>> added into the allocation features of resctrl after James' resctrl rework has
>> finished.
>>>> But, in order to implement this function, more interfaces for resctrl are
>> need.
>>>> The details are as follow, and could you give me some advice?
>>>>
>>>> [Sector cache function]
>>>> The sector cache function split cache into multiple sectors and
>>>> control them separately. It is implemented on the L1D cache and
>>>> L2 cache in the A64FX processor and can be controlled individually
>>>> for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache and
>>>> L2 cache has 4 sectors. Which L1D sector is used is specified by the
>>>> value of [57:56] bits of address, how many ways of sector are
>>>> specified by the value of register (IMP_SCCR_L1_EL0).
>>>> Which L2 sector is used is specified by the value of [56] bits of
>>>> address, and how many ways of sector are specified by value of
>>>> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
>>>> IMP_SCCR_SET1_L2_EL1).
>>>>
>>>> For more details of sector cache function, see A64FX HPC extension
>>>> specification (1.2. Sector cache) in https://github.com/fujitsu/A64FX
>>
>> The overview in section 12 was informative but very high level.
> 
> I'm considering how to answer your questions from your email which
> I received before, when I check the email again, I am sorry that
> the information I provided before are insufficient.
> 
> To understand the sector cache function of A64FX, could you please see
> A64FX_Microarchitecture_Manual - section 12. Sector Cache
> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.4.pdf
> and,
> A64FX_Specification_HPC_Extension - section 1.2. Sector Cache
> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf

Thank you for the direct links - I missed that there are two documents  
available.

After reading the spec portion it does seem to me even more as though  
"sectors" could be considered the same as the resctrl "classes of  
service". The Fujitsu hardware supports four sectors that can be  
configured with different number of ways using the registers you mention  
above. In resctrl this could be considered as hardware that supports  
four classes of service and each class of service can be allocated a  
different number of ways.

The other part is how hardware knows which sector is being used at any  
moment in time. In resctrl that is programmed by writing the active  
class of service into needed register at the time the application is  
context switched (resctrl_sched_in()). This seems different here since  
as you describe the sector is chosen by bits in the address. Even so,  
which bits to set in the address needs to be programmed also and I also  
understand that there is a "default" sector that can be programmed via  
register. Could these be equivalent to what is done currently in resctrl?

(Could you please also consider my original questions?)

> 
> In addition, Japan will be on a long holiday about one week from
> April 29th, I will answer your other questions after the holidays.
> 
>> I was not able to find any instance of "IMP_SCCR" in this document to explore
>> how this cache allocation works.
>>
>> Are these cache sectors exposed to the OS in any way? For example, when the
>> OS discovers the cache, does it learn about these sectors and expose the
>> details to user space (/sys/devices/system/cpuX/cache)?
>>
>> The overview of Sector Cache in that document provides details of how the size
>> of the sector itself is dynamically adjusted to usage. That description is quite
>> cryptic but it seems like a sector, since the number of ways associated with it
>> can dynamically change, is more equivalent to a class of service or resource
>> group in the resctrl environment.
>>
>> I really may be interpreting things wrong here, could you perhaps point me to
>> where I can obtain more details?
>>
>>
>>>> [Difference between resctrl(CAT) and this sector cache function]
>>>> L2/L3 CAT (Cache Allocation Technology) enables the user to specify
>>>> some physical partition of cache space that an application can fill.
>>>> A64FX's L1D/L2 cache has 4 sectors and 16ways. This sector function
>>>> enables a user to specify number of ways each sector uses.
>>>> Therefore, for CAT it is enough to specify a cache portion for each
>>>> cache_id (socket). On the other hand, sector cache needs to specify
>>>> cache portion of each sector for each cache_id, and following
>>>> extension to resctrl interface is needed to support sector cache.
>>>>
>>>> [Idear for A64FX sector cache function control interface (schemata
>>>> file details)]
>>>>
>>>> L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>>>>
>>>> L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
>>>>
>>>> ・L1: Add a new interface to control the L1D cache.
>>>> ・<cwbm>,<cwbm>,<cwbm>,<cwbm>:Specify the number of ways for
>> each
>>>> sector.
>>>> ・cwbm:Specify the number of ways in each sector as a bitmap
>> (percentage),
>>>>     but the bitmap does not indicate the location of the cache.
>>>> * In the sector cache function, L2 sector cache way setting register is
>>>>     shared among PEs (Processor Element) in shared domain. If two PEs
>>>>     which share L2 cache belongs to different resource groups, one
>> resource
>>>>     group's L2 setting will affect to other resource group's L2 setting.
>>
>> In resctrl a "resource group" can be viewed as a class of service.
>>
>>>> * Since A64FX does not support MPAM, it is not necessary to consider
>>>>     how to switch between MPAM and sector cache function now.
>>>>
>>>> Some questions:
>>>> 1.I'm still studying about RDT, could you tell me whether RDT has
>>>>     the similar mechanism with sector cache function?
>>
>> This is not clear to me yet. One thing to keep in mind is that a bit in the capacity
>> bitmask could correspond to some number of ways in a cache, but it does not
>> have to. It is essentially a hint to hardware on how much cache space needs to
>> be allocated while also indicating overlap and isolation from other allocations.
>>
>> resctrl already supports the bitmask being interpreted differently between
>> architectures and with the MPAM support there will be even more support for
>> different interpretations.
>>
>>>> 2.In RDT, L3 cache is shared among cores in socket. If two cores which
>>>>     share L3 cache belongs to different resource groups, one resource
>>>>     group's L3 setting will affect to other resource group's L3 setting?
>>
>> This question is not entirely clear to me. Are you referring to the hardware layout
>> or configuration changes via the resctrl "cpus" file?
>>
>> Each resource group is a class of service (CLOS) that is supported by all cache
>> instances. By default each resource group would thus contain all cache
>> instances on the system (even if some cache instances do not support the
>> same number of CLOS resctrl would only support the CLOS supported by all
>> resources).
> 

Reinette


* RE: About add an A64FX cache control function into resctrl
  2021-04-29 17:42       ` Reinette Chatre
@ 2021-04-29 17:50         ` Luck, Tony
  2021-04-30 11:46           ` Catalin Marinas
  2021-05-17  8:31         ` tan.shaopeng
  1 sibling, 1 reply; 20+ messages in thread
From: Luck, Tony @ 2021-04-29 17:50 UTC (permalink / raw)
  To: Chatre, Reinette, tan.shaopeng, Yu, Fenghua
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

>>>> [Sector cache function]
>>>> The sector cache function split cache into multiple sectors and
>>>> control them separately. It is implemented on the L1D cache and
>>>> L2 cache in the A64FX processor and can be controlled individually
>>>> for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache and
>>>> L2 cache has 4 sectors. Which L1D sector is used is specified by the
>>>> value of [57:56] bits of address, how many ways of sector are
>>>> specified by the value of register (IMP_SCCR_L1_EL0).
>>>> Which L2 sector is used is specified by the value of [56] bits of
>>>> address, and how many ways of sector are specified by value of
>>>> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
>>>> IMP_SCCR_SET1_L2_EL1).

Are A64FX binaries position independent?  I.e. could the OS reassign
a running task to a different sector by remapping it to different virtual
addresses during a context switch?

Or is this a static property at task launch?

-Tony


* Re: About add an A64FX cache control function into resctrl
  2021-04-29 17:50         ` Luck, Tony
@ 2021-04-30 11:46           ` Catalin Marinas
  2021-05-17  8:29             ` tan.shaopeng
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2021-04-30 11:46 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Chatre, Reinette, tan.shaopeng, Yu, Fenghua,
	'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

On Thu, Apr 29, 2021 at 05:50:20PM +0000, Luck, Tony wrote:
> >>>> [Sector cache function]
> >>>> The sector cache function split cache into multiple sectors and
> >>>> control them separately. It is implemented on the L1D cache and
> >>>> L2 cache in the A64FX processor and can be controlled individually
> >>>> for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache and
> >>>> L2 cache has 4 sectors. Which L1D sector is used is specified by the
> >>>> value of [57:56] bits of address, how many ways of sector are
> >>>> specified by the value of register (IMP_SCCR_L1_EL0).
> >>>> Which L2 sector is used is specified by the value of [56] bits of
> >>>> address, and how many ways of sector are specified by value of
> >>>> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
> >>>> IMP_SCCR_SET1_L2_EL1).
> 
> Are A64FX binaries position independent?  I.e. could the OS reassign
> a running task to a different sector by remapping it to different virtual
> addresses during a context switch?

Arm64 supports a maximum of 52-bit of virtual or physical addresses. The
maximum the MMU would produce would be a 52-bit output address. I
presume bits 56, 57 of the address bus are used for some cache affinity
(sector selection) but they don't influence the memory addressing, nor
could the MMU set them.

-- 
Catalin


* RE: About add an A64FX cache control function into resctrl
  2021-04-30 11:46           ` Catalin Marinas
@ 2021-05-17  8:29             ` tan.shaopeng
  0 siblings, 0 replies; 20+ messages in thread
From: tan.shaopeng @ 2021-05-17  8:29 UTC (permalink / raw)
  To: 'Catalin Marinas', Luck, Tony
  Cc: Chatre, Reinette, Yu, Fenghua,
	'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi, Tony, Catalin

> On Thu, Apr 29, 2021 at 05:50:20PM +0000, Luck, Tony wrote:
> > >>>> [Sector cache function]
> > >>>> The sector cache function split cache into multiple sectors and
> > >>>> control them separately. It is implemented on the L1D cache and
> > >>>> L2 cache in the A64FX processor and can be controlled
> > >>>> individually for L1D cache and L2 cache. A64FX has no L3 cache.
> > >>>> Each L1D cache and
> > >>>> L2 cache has 4 sectors. Which L1D sector is used is specified by
> > >>>> the value of [57:56] bits of address, how many ways of sector are
> > >>>> specified by the value of register (IMP_SCCR_L1_EL0).
> > >>>> Which L2 sector is used is specified by the value of [56] bits of
> > >>>> address, and how many ways of sector are specified by value of
> > >>>> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
> > >>>> IMP_SCCR_SET1_L2_EL1).
> >
> > Are A64FX binaries position independent?  I.e. could the OS reassign a
> > running task to a different sector by remapping it to different
> > virtual addresses during a context switch?
> 
> Arm64 supports a maximum of 52-bit of virtual or physical addresses. The
> maximum the MMU would produce would be a 52-bit output address. I
> presume bits 56, 57 of the address bus are used for some cache affinity (sector
> selection) but they don't influence the memory addressing, nor could the MMU
> set them.
Yes, A64FX binaries are position independent. Arm64 supports
virtual and physical addresses of at most 52 bits. On A64FX,
bits [57:56] of the virtual address are used for cache affinity
(sector selection) and are set by the user program, not by the MMU.
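
As an illustration only, the tagging described above can be sketched in
Python. The helper name tag_address() and the constants are hypothetical
and are not an A64FX or kernel API; only the bit positions [57:56] come
from the description in this thread.

```python
# Illustrative model of the sector ID tag in bits [57:56] of a 64-bit
# virtual address. tag_address() is a hypothetical helper for this sketch.
SECTOR_SHIFT = 56   # lowest bit of the sector ID field
SECTOR_MASK = 0x3   # two bits, so sector IDs 0..3


def tag_address(addr, sector):
    """Return addr with bits [57:56] replaced by the given sector ID."""
    if not 0 <= sector <= SECTOR_MASK:
        raise ValueError("sector ID must be in 0..3")
    cleared = addr & ~(SECTOR_MASK << SECTOR_SHIFT)
    # Keep the result within 64 bits, as a pointer value would be.
    return (cleared | (sector << SECTOR_SHIFT)) & 0xFFFF_FFFF_FFFF_FFFF


tagged = tag_address(0x0000_FFFF_1234_5000, 2)
print(hex(tagged))
```

The low 56 bits are left untouched, which matches the point that these
bits do not influence memory addressing.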

Best regards,
Tan Shaopeng


* RE: About add an A64FX cache control function into resctrl
  2021-04-29 17:42       ` Reinette Chatre
  2021-04-29 17:50         ` Luck, Tony
@ 2021-05-17  8:31         ` tan.shaopeng
  2021-05-21 17:44           ` Reinette Chatre
  1 sibling, 1 reply; 20+ messages in thread
From: tan.shaopeng @ 2021-05-17  8:31 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Reinette,

I’m sorry for the late reply.
I think I did not explain A64FX’s sector cache function well in
my first mail. While answering your questions, I will also explain
this function in more detail. You may already have learned more
about it from the specification and the manual, so some of the
explanations below may be repeated.

> >> The overview in section 12 was informative but very high level.
> >
> > I'm considering how to answer your questions from your email which I
> > received before, when I check the email again, I am sorry that the
> > information I provided before are insufficient.
> >
> > To understand the sector cache function of A64FX, could you please see
> > A64FX_Microarchitecture_Manual - section 12. Sector Cache
> > https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.4.pdf
> > and,
> > A64FX_Specification_HPC_Extension - section 1.2. Sector Cache
> > https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf
> 
> Thank you for the direct links - I missed that there are two documents available.
> 
> After reading the spec portion it does seem to me even more as though
> "sectors" could be considered the same as the resctrl "classes of service". The
> Fujitsu hardware supports four sectors that can be configured with different
> number of ways using the registers you mention above. In resctrl this could be
> considered as hardware that supports four classes of service and each class of
> service can be allocated a different number of ways.

Fujitsu hardware supports four sectors that can be configured with
different numbers of ways by using the "IMP_SCCR" registers. When this
function is added into resctrl, the maximum number of ways of each
sector is indicated by a bitmap.

However, A64FX's L2 cache setting registers are shared among the PEs
(Processor Elements) in a NUMA node. If two PEs in the same NUMA node
are assigned to different resource groups, changing the L2 setting in
one resource group also changes the L2 setting seen by the PE in the
other resource group. So, when adding this function into resctrl, we
will assign NUMA nodes to the resource group. (On A64FX, each NUMA node
has 12 PEs, and each PE has its own L1 cache setting registers, which
are not shared.) There are 4 NUMA nodes on A64FX, so the hardware could
be considered as supporting at most four classes of service. Each
class of service has 4 sectors (4 L1 sectors and 4 L2 sectors),
and each sector can be allocated a different number of ways.
When a task is running in a resource group, bits [57:56] of the
virtual address are used for sector selection (cache affinity).
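
A rough Python sketch of the constraint described above: a resource
group should contain whole NUMA nodes, because the L2 setting registers
are shared by all 12 PEs of a node. The PE numbering (contiguous,
12 per node) and the function names are assumptions made only for this
illustration.

```python
# Illustrative check that a set of PEs covers whole NUMA nodes.
# Assumption: PEs are numbered contiguously, 12 per NUMA node.
PES_PER_NODE = 12


def node_of(pe):
    """Return the NUMA node index of a PE under the assumed numbering."""
    return pe // PES_PER_NODE


def covers_whole_nodes(pes):
    """True if every NUMA node touched by pes is fully included."""
    pes = set(pes)
    for node in {node_of(pe) for pe in pes}:
        first = node * PES_PER_NODE
        members = set(range(first, first + PES_PER_NODE))
        if not members <= pes:
            return False
    return True


print(covers_whole_nodes(range(0, 12)))   # all PEs of node 0
print(covers_whole_nodes(range(0, 13)))   # node 0 plus one PE of node 1
```

With such a check, a group that splits a node could be rejected before
the shared L2 registers are written.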

> The other part is how hardware knows which sector is being used at any
> moment in time. In resctrl that is programmed by writing the active class of
> service into needed register at the time the application is context switched
> (resctrl_sched_in()). This seems different here since as you describe the
> sector is chosen by bits in the address. Even so, which bits to set in the
> address needs to be programmed also and I also understand that there is a
> "default" sector that can be programmed via register. Could these be equivalent
> to what is done currently in resctrl?

When adding this function into resctrl, there is no need to write the
active class of service into a register. When a task is running, the
sector ID is decided by bits [57:56] of the virtual address, and these
bits are programmed by users. When creating a resource group, the
maximum number of ways of each sector is set through the "IMP_SCCR"
registers. As long as the task is running in a certain resource group,
the sectors and their maximum numbers of ways will not change.
Therefore, we need not consider context switches on A64FX.

> (Could you please also consider my original questions?)
I will reply to the mail with your original questions.


Best regards,
Tan Shaopeng


* RE: About add an A64FX cache control function into resctrl
  2021-04-21 16:39   ` Reinette Chatre
  2021-04-23  8:10     ` tan.shaopeng
  2021-04-28  8:16     ` tan.shaopeng
@ 2021-05-17  8:37     ` tan.shaopeng
  2 siblings, 0 replies; 20+ messages in thread
From: tan.shaopeng @ 2021-05-17  8:37 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro

Hi Reinette,

> On 4/21/2021 1:37 AM, tan.shaopeng@fujitsu.com wrote:
> > Hi,
> >
> > Ping... any comments&advice about add an A64FX cache control function
> into resctrl?
> 
> My apologies for the delay.
> 
> >
> > Best regards
> > Tan Shaopeng
> >
> >> Hello
> >>
> >>
> >> I'm Tan Shaopeng from Fujitsu Limited.
> >>
> >> I’m trying to implement Fujitsu A64FX’s cache related features.
> >> It is a cache partitioning function we called sector cache function
> >> that using the value of the tag that is upper 8 bits of the 64bit
> >> address and the value of the sector cache register to control virtual cache
> capacity of the L1D&L2 cache.
> >>
> >> A few days ago, when I sent a driver that realizes this function to
> >> ARM64 kernel community, Will Deacon and Arnd Bergmann suggested an
> >> idea to add the sector cache function of A64FX into resctrl.
> >>
> >> https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5OcZ=As2kM0D_sbfFcGQ_J2Q+Q@mail.gmail.com/
> >>
> >> Based on my study, I think the sector cache function of A64FX can be
> >> added into the allocation features of resctrl after James' resctrl rework has
> finished.
> >> But, in order to implement this function, more interfaces for resctrl are
> need.
> >> The details are as follow, and could you give me some advice?
> >>
> >> [Sector cache function]
> >> The sector cache function split cache into multiple sectors and
> >> control them separately. It is implemented on the L1D cache and
> >> L2 cache in the A64FX processor and can be controlled individually
> >> for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache and
> >> L2 cache has 4 sectors. Which L1D sector is used is specified by the
> >> value of [57:56] bits of address, how many ways of sector are
> >> specified by the value of register (IMP_SCCR_L1_EL0).
> >> Which L2 sector is used is specified by the value of [56] bits of
> >> address, and how many ways of sector are specified by value of
> >> register (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
> >> IMP_SCCR_SET1_L2_EL1).
> >>
> >> For more details of sector cache function, see A64FX HPC extension
> >> specification (1.2. Sector cache) in https://github.com/fujitsu/A64FX
> 
> The overview in section 12 was informative but very high level.
> I was not able to find any instance of "IMP_SCCR" in this document to explore
> how this cache allocation works.

As you may already know, the sector cache works as follows.
 - Set the maximum number of ways for each sector (id = 0/1/2/3) of
   L1 and L2 by setting the "IMP_SCCR" registers.
 - When running a task, the sector ID is specified by bits [57:56] of
   the virtual address. If the sector ID is not specified, the sector
   ID given in IMP_SCCR_ASSIGN_EL1.default_sector is used.

> Are these cache sectors exposed to the OS in any way? For example, when the
> OS discovers the cache, does it learn about these sectors and expose the
> details to user space (/sys/devices/system/cpuX/cache)?

These cache sectors are not exposed to the OS in any way.

> The overview of Sector Cache in that document provides details of how the size
> of the sector itself is dynamically adjusted to usage. That description is quite
> cryptic but it seems like a sector, since the number of ways associated with it
> can dynamically change, is more equivalent to a class of service or resource
> group in the resctrl environment.

I explained the difference between "sector" and "class of service" 
in another email.

> I really may be interpreting things wrong here, could you perhaps point me to
> where I can obtain more details?

I'm sorry, there is no documentation other than the manual and
specifications. In more detail, the sector cache function works
as follows.
(1) By setting the access control register IMP_SCCR_CTRL_EL1, cache 
    capacity setting registers (IMP_SCCR_CTRL_EL1, IMP_SCCR_ASSIGN_EL1, 
    IMP_SCCR_L1_EL0, IMP_SCCR_SET0_L2_EL1, IMP_SCCR_SET1_L2_EL1, 
    IMP_SCCR_VSCCR_L2_EL0) can be set from user space or kernel space. 
(2) Set L1 sector cache capacity register from kernel space. 
    By setting the register IMP_SCCR_L1_EL0, set the maximum number 
    of ways of sector (id = 0/1/2/3) of L1. 
(3) Set L2 sector cache capacity register. 
    (one of cases) By setting IMP_SCCR_ASSIGN_EL1.assign = 0 from 
    kernel space, IMP_SCCR_VSCCR_L2_EL0 becomes alias of 
    IMP_SCCR_SET0_L2_EL1. By setting IMP_SCCR_VSCCR_L2_EL0 from user 
    space, set the maximum number of ways of sector (id = 0/1) of L2. 
(4) When running a task, the sector ID of L1 is decided by bits
    [57:56] of the virtual address, and the sector ID of L2 is decided
    by bit [56] of the address. These bits are programmed by the users.
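
Step (4) can be sketched as follows. This is only an illustration in
Python; the function names are hypothetical, and the only facts taken
from the steps above are that L1 uses bits [57:56] and L2 uses bit [56]
of the address.

```python
# Illustrative decoding of the sector ID from a tagged virtual address.
# l1_sector() and l2_sector() are hypothetical names for this sketch.
TAG_SHIFT = 56


def l1_sector(addr):
    """L1D sector ID: bits [57:56] of the address (0..3)."""
    return (addr >> TAG_SHIFT) & 0x3


def l2_sector(addr):
    """L2 sector ID: bit [56] of the address (0..1)."""
    return (addr >> TAG_SHIFT) & 0x1


addr = (3 << TAG_SHIFT) | 0x1000
print(l1_sector(addr), l2_sector(addr))
```

Because L2 looks at bit [56] only, two addresses that select different
L1 sectors can still select the same L2 sector.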

> >> [Difference between resctrl(CAT) and this sector cache function]
> >> L2/L3 CAT (Cache Allocation Technology) enables the user to specify
> >> some physical partition of cache space that an application can fill.
> >> A64FX's L1D/L2 cache has 4 sectors and 16ways. This sector function
> >> enables a user to specify number of ways each sector uses.
> >> Therefore, for CAT it is enough to specify a cache portion for each
> >> cache_id (socket). On the other hand, sector cache needs to specify
> >> cache portion of each sector for each cache_id, and following
> >> extension to resctrl interface is needed to support sector cache.
> >>
> >> [Idear for A64FX sector cache function control interface (schemata
> >> file details)]
> >>
> >> L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
> >>
> >> L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;…
> >>
> >> ・L1: Add a new interface to control the L1D cache.
> >> ・<cwbm>,<cwbm>,<cwbm>,<cwbm>:Specify the number of ways for
> each
> >> sector.
> >> ・cwbm:Specify the number of ways in each sector as a bitmap
> (percentage),
> >>    but the bitmap does not indicate the location of the cache.
> >> * In the sector cache function, L2 sector cache way setting register is
> >>    shared among PEs (Processor Element) in shared domain. If two PEs
> >>    which share L2 cache belongs to different resource groups, one
> resource
> >>    group's L2 setting will affect to other resource group's L2 setting.
> 
> In resctrl a "resource group" can be viewed as a class of service.

Thanks for your explanation. When adding the sector cache function
into resctrl, I will use this resctrl mechanism as it is.

> >> * Since A64FX does not support MPAM, it is not necessary to consider
> >>    how to switch between MPAM and sector cache function now.
> >>
> >> Some questions:
> >> 1.I'm still studying about RDT, could you tell me whether RDT has
> >>    the similar mechanism with sector cache function?
> 
> This is not clear to me yet. One thing to keep in mind is that a bit in the capacity
> bitmask could correspond to some number of ways in a cache, but it does not
> have to. It is essentially a hint to hardware on how much cache space needs to
> be allocated while also indicating overlap and isolation from other allocations.
> 
> resctrl already supports the bitmask being interpreted differently between
> architectures and with the MPAM support there will be even more support for
> different interpretations.

When adding the sector cache function into resctrl,
the bitmap will only show the maximum number of ways of a sector
and will not indicate the cache position as it does in RDT.
A sector is a group of cache ways, and one cache line cannot be assigned
to different sectors at the same time. Different sectors have different
cache space. When different tasks use different sectors,
the cache space they use can be isolated.

> >> 2.In RDT, L3 cache is shared among cores in socket. If two cores which
> >>    share L3 cache belongs to different resource groups, one resource
> >>    group's L3 setting will affect to other resource group's L3 setting?
> 
> This question is not entirely clear to me. Are you referring to the hardware layout
> or configuration changes via the resctrl "cpus" file?
> 
> Each resource group is a class of service (CLOS) that is supported by all cache
> instances. By default each resource group would thus contain all cache
> instances on the system (even if some cache instances do not support the
> same number of CLOS resctrl would only support the CLOS supported by all
> resources).

[Idea for A64FX sector cache function control]
An example of using the sector cache function through resctrl is
as follows.
  # mount -t resctrl resctrl /sys/fs/resctrl
  # cd /sys/fs/resctrl
  # mkdir p0
  # echo XXXX > /sys/fs/resctrl/p0/cpus *1
  # echo "L1:0=000F,000F,000F,000F;1=000F,000F,000F,000F" > /sys/fs/resctrl/p0/schemata *2
  # echo "L2:0=000F,000F,0,0;1=0,0,000F,000F" > /sys/fs/resctrl/p0/schemata *2
  # echo PID > /sys/fs/resctrl/p0/tasks

*1
   Since the A64FX L2 settings are shared within a NUMA node, all PEs
   (cores) on the same NUMA node should be specified at the same time.
   In other words, we want to assign NUMA nodes instead of PEs to the
   resource group. Maybe it is better to change the interface to
   numas (/sys/fs/resctrl/p0/numas). Could you give me some advice?
*2 
   L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;… 
   L2:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;… 
  ・ L1:
     Add a new interface to control the L1D cache. 
  ・ <cwbm>,<cwbm>,<cwbm>,<cwbm>:
     Specify the number of ways for each sector.  
     Each L1/L2 cache has 4 sectors, and the number of ways must be
     specified for each sector.
  ・ cwbm:
     Specify the number of ways in each sector as a bitmap (percentage),
     but the bitmap does not indicate the position in the cache.
     The range is from 0 to 16 ways.
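
To illustrate how a line in the schemata format proposed above could be
interpreted, here is a minimal Python sketch. It is not resctrl code;
the function name parse_sector_schemata() is hypothetical, and the
rule "number of set bits = number of ways" is an assumption based on
the statement that the bitmap does not indicate a position.

```python
# Illustrative parser for the proposed per-sector schemata line, e.g.
#   "L2:0=000F,000F,0,0;1=0,0,000F,000F"
# parse_sector_schemata() is a hypothetical name for this sketch.
SECTORS = 4
MAX_WAYS = 16


def parse_sector_schemata(line):
    """Return (resource, {cache_id: [ways for sector 0..3]})."""
    resource, _, rest = line.strip().partition(":")
    domains = {}
    for entry in rest.split(";"):
        if not entry:
            continue
        cache_id, _, bitmaps = entry.partition("=")
        ways = []
        for bm in bitmaps.split(","):
            value = int(bm, 16)
            if value >> MAX_WAYS:
                raise ValueError("bitmap wider than 16 ways: " + bm)
            # Assumption: only the count of set bits matters.
            ways.append(bin(value).count("1"))
        if len(ways) != SECTORS:
            raise ValueError("expected 4 bitmaps per cache_id")
        domains[int(cache_id)] = ways
    return resource, domains


print(parse_sector_schemata("L2:0=000F,000F,0,0;1=0,0,000F,000F"))
```

A real implementation would also have to decide how the four values map
onto the IMP_SCCR registers, which this sketch does not attempt.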

When creating a resource group, the number of ways for 4 sectors of 
L1&L2 is set. When running a task, the [56:57] bits of virtual 
address are used for sector selection. Even different tasks running 
in the same resource group can use different sector caches. Therefore, 
when running a task that handles a large amount of infrequently used 
data and a task that handles a large amount of frequently used data 
at the same time, cache size limitation and cache space isolation can 
be performed, and cache thrashing also can be reduced. 

Is this approach acceptable? Could you give me some advice?  

Best regards,
Tan Shaopeng



* Re: About add an A64FX cache control function into resctrl
  2021-05-17  8:31         ` tan.shaopeng
@ 2021-05-21 17:44           ` Reinette Chatre
  2021-05-25  8:45             ` tan.shaopeng
  0 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2021-05-21 17:44 UTC (permalink / raw)
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Tan Shaopeng,

On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
> Hi Reinette,
> 
> I’m sorry for the late reply.
> I think I could not explain A64FX’s sector cache function well in
> my first mail. While answering the question, I will also explain
> this function in more detail. Though maybe you have already learned
> more about this function by reading specification and manual,
> in order to better understand this function, some contents may have
> duplicate explanations.
> 
>>>> The overview in section 12 was informative but very high level.
>>>
>>> I'm considering how to answer your questions from your email which I
>>> received before, when I check the email again, I am sorry that the
>>> information I provided before are insufficient.
>>>
>>> To understand the sector cache function of A64FX, could you please see
>>> A64FX_Microarchitecture_Manual - section 12. Sector Cache
>>> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.4.pdf
>>> and,
>>> A64FX_Specification_HPC_Extension - section 1.2. Sector Cache
>>> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf
>>
>> Thank you for the direct links - I missed that there are two documents available.
>>
>> After reading the spec portion it does seem to me even more as though
>> "sectors" could be considered the same as the resctrl "classes of service". The
>> Fujitsu hardware supports four sectors that can be configured with different
>> number of ways using the registers you mention above. In resctrl this could be
>> considered as hardware that supports four classes of service and each class of
>> service can be allocated a different number of ways.
> 
> Fujitsu hardware supports four sectors that can be configured with
> different number of ways by using "IMP_SCCR" registers, and when this
> function is added into resctrl, the maximum ways of each sector are
> indicated by bitmap.
> 
> However, A64FX's L2 cache setting registers are shared among PEs
> (Processor Element) in NUMA. If two PEs in the same NUMA are assigned
> to different resource groups, changing one PE's L2 setting on one
> resource group, the other PE's L2 setting on other resource groups
> will be influenced. So, adding this function into resctrl, we will
> assign NUMA to the resource group. (On A64FX, each NUMA has 12 PEs,
> and each PE has L1 cache setting registers, but these registers are
> not shared.) There are 4 NUMAs on A64FX, 4 NUMAs could be considered
> as hardware that supports four classes of service at most, and each
> class of service has 4 sectors (4 L1 sectors& 4 L2 sectors),
> and each sector can be allocated a different number of ways.
> And, when a running task on resource group, the [56:57] bits of
> virtual address are used for sector selection (cache affinity).

It is not clear to me why NUMA needs to be involved.

Processors sharing a cache, either L2 or L3 cache, is familiar and well  
supported by resctrl.

My understanding of the sector cache feature is that each cache can be  
split into multiple (4) sectors. It thus seems to me something specific  
to the cache itself.

Let me try and give an example of my understanding based on the cache  
architecture described in the A64FX Microarchitecture Manual.

I see in Figure 9-2 that each processor has an L1D as well as L1I Cache,  
and twelve processors share an L2 cache. The L1D cache has 4 ways (0xF  
bitmask) and L2 cache has 16 (0xFFFF bitmask) ways. From what I  
understand the sector cache function is supported on L1D and L2.

First, the goal would be to discover all the caches on the system,
since it is the sectors that need to be programmed on each cache. On a
system with 48 cores there would thus be 48 L1D caches and 4 L2 caches.

Let's start by assigning the caches IDs: the L1D caches are numbered  
from 0 to 47 and the L2 caches numbered from 0 to 3.
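
The numbering above can be modelled in a few lines. This is only a
sketch: it assumes CPUs are numbered so that each run of 12 consecutive
CPUs shares one L2 cache, which this thread does not state, and
cache_ids() is a hypothetical helper, not a resctrl function.

```python
CORES = 48          # cores in the example system
CORES_PER_L2 = 12   # twelve cores share one L2 cache

def cache_ids(cpu):
    """Return (l1d_id, l2_id) for a CPU under the assumed numbering."""
    if not 0 <= cpu < CORES:
        raise ValueError(f"cpu {cpu} out of range")
    # One private L1D per core; one L2 per group of 12 cores.
    return cpu, cpu // CORES_PER_L2

l1d_ids = {cache_ids(cpu)[0] for cpu in range(CORES)}
l2_ids = {cache_ids(cpu)[1] for cpu in range(CORES)}
print(len(l1d_ids), len(l2_ids))  # prints: 48 4
```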

My understanding is that the goal is to program these sectors using
resctrl. Each cache instance can have at most four sectors, and they
cannot overlap. (I do not know if each sector has to have some portion
of cache associated with it or if a sector is allowed to be "empty".)

So, what is needed is, for example, to have a way to say: "sector 0 on  
cache L1D with id X is assigned Y ways", "sector 1 on cache L2 with id Z  
is assigned XX ways". Is this correct?

If my understanding is correct then you can do this with resctrl as  
follows (I am making many assumptions on behavior here, especially  
regarding how many ways a sector is required to have, but I hope this  
could be a baseline to evaluate and correct my understanding and build  
on how this could be supported):

On boot all cache ways on all cache instances belong to sector 0:

# cd /sys/fs/resctrl/
# cat schemata
L1D:0=0xf;1=0xf;2=0xf;.....;47=0xf
L2:0=0xffff;1=0xffff;2=0xffff;3=0xffff

Create sector1 and assign half of all cache ways to it:
(In support of this it would be required that resctrl resource groups
are exclusive. Exclusive resource groups are already supported, but they
are not the default as would be needed here.)

First, to provide cache ways to sector 1, the cache ways need to be
removed from sector 0:
(I am not sure if specific ways can be assigned to a sector or just a
number of ways, both could be supported)
# echo 'L1D:0=0x3;1=0x3;...;47=0x3' > /sys/fs/resctrl/schemata
# echo 'L2:0=0xff;1=0xff;2=0xff;3=0xff' > /sys/fs/resctrl/schemata

Now create sector1 (alternatively all sectors could exist on boot for
this system):
# mkdir /sys/fs/resctrl/sector1
# echo 'L1D:0=0x3;1=0x3;...;47=0x3' > /sys/fs/resctrl/sector1/schemata
# echo 'L2:0=0xff;1=0xff;2=0xff;3=0xff' > /sys/fs/resctrl/sector1/schemata

At this point there are two sectors configured. The configuration of
sector0 can be found in /sys/fs/resctrl/schemata and the configuration
of sector1 in /sys/fs/resctrl/sector1/schemata.
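
The schemata strings used in this example follow a regular pattern,
which the sketch below generates. schemata_line() is a hypothetical
helper, and the "L1D" resource name is taken from the example in this
mail rather than from an existing resctrl resource. The sketch only
builds the text; it does not write to /sys/fs/resctrl.

```python
def schemata_line(resource, num_domains, mask):
    """Build a schemata line that gives `mask` to every cache instance."""
    domains = ";".join(f"{i}={mask:#x}" for i in range(num_domains))
    return f"{resource}:{domains}"

# Half of the 16 L2 ways on each of the four L2 caches.
print(schemata_line("L2", 4, 0xFF))  # L2:0=0xff;1=0xff;2=0xff;3=0xff

# Half of the 4 L1D ways on each of the 48 L1D caches.
l1d = schemata_line("L1D", 48, 0x3)
```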

>> The other part is how hardware knows which sector is being used at any
>> moment in time. In resctrl that is programmed by writing the active class of
>> service into needed register at the time the application is context switched
>> (resctrl_sched_in()). This seems different here since as you describe the
>> sector is chosen by bits in the address. Even so, which bits to set in the
>> address needs to be programmed also and I also understand that there is a
>> "default" sector that can be programmed via register. Could these be equivalent
>> to what is done currently in resctrl?
> 
> Adding this function into resctrl, there is no need to write active
> class of service into needed register. When running a task, the sector
> id is decided by [56:57] bits of virtual address, and these bits are
> programed by users. When creating a resource group, the maximum number
> of ways of each sector are set by "IMP_SCCR" setting registers.
> As long as the task is running in a certain resource group, the sector
> and the maximum number of ways of sectors are used will not be changed.
> Therefore, we need not consider context switches on A64FX.
> 

The current interface would associate a "tasks" file with each sector to  
indicate which tasks run with the particular sector id. I thought there  
was a way to program the default sector id in a register, which is  
something that could be done when a task is context switched in.  
Otherwise there would need to be some re-architecting to remove the  
"tasks" association. This would be a significant change.

Reinette


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* RE: About add an A64FX cache control function into resctrl
  2021-05-21 17:44           ` Reinette Chatre
@ 2021-05-25  8:45             ` tan.shaopeng
  2021-05-26 17:36               ` Reinette Chatre
  0 siblings, 1 reply; 20+ messages in thread
From: tan.shaopeng @ 2021-05-25  8:45 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Reinette,

Sorry, I have not yet explained A64FX's sector cache function well.
I think I need to explain this function from a different perspective.

> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
> > Hi Reinette,
> >
> > I’m sorry for the late reply.
> > I think I could not explain A64FX’s sector cache function well in my
> > first mail. While answering the question, I will also explain this
> > function in more detail. Though maybe you have already learned more
> > about this function by reading specification and manual, in order to
> > better understand this function, some contents may have duplicate
> > explanations.
> >
> >>>> The overview in section 12 was informative but very high level.
> >>>
> >>> I'm considering how to answer your questions from your email which I
> >>> received before, when I check the email again, I am sorry that the
> >>> information I provided before are insufficient.
> >>>
> >>> To understand the sector cache function of A64FX, could you please
> >>> see A64FX_Microarchitecture_Manual - section 12. Sector Cache
> >>>
> >>> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.4.pdf
> >>> and,
> >>> A64FX_Specification_HPC_Extension - section 1.2. Sector Cache
> >>>
> >>> https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Specification_HPC_Extension_v1_EN.pdf
> >>
> >> Thank you for the direct links - I missed that there are two
> >> documents available.
> >>
> >> After reading the spec portion it does seem to me even more as though
> >> "sectors" could be considered the same as the resctrl "classes of
> >> service". The Fujitsu hardware supports four sectors that can be
> >> configured with different number of ways using the registers you
> >> mention above. In resctrl this could be considered as hardware that
> >> supports four classes of service and each class of service can be
> >> allocated a different number of ways.
> >
> > Fujitsu hardware supports four sectors that can be configured with
> > different number of ways by using "IMP_SCCR" registers, and when this
> > function is added into resctrl, the maximum ways of each sector are
> > indicated by bitmap.
> >
> > However, A64FX's L2 cache setting registers are shared among PEs
> > (Processor Element) in NUMA. If two PEs in the same NUMA are assigned
> > to different resource groups, changing one PE's L2 setting on one
> > resource group, the other PE's L2 setting on other resource groups
> > will be influenced. So, adding this function into resctrl, we will
> > assign NUMA to the resource group. (On A64FX, each NUMA has 12 PEs,
> > and each PE has L1 cache setting registers, but these registers are
> > not shared.) There are 4 NUMAs on A64FX, 4 NUMAs could be considered
> > as hardware that supports four classes of service at most, and each
> > class of service has 4 sectors (4 L1 sectors& 4 L2 sectors), and each
> > sector can be allocated a different number of ways.
> > And, when a running task on resource group, the [56:57] bits of
> > virtual address are used for sector selection (cache affinity).
> 
> It is not clear to me why NUMA needs to be involved.
> 
> Processors sharing a cache, either L2 or L3 cache, is familiar and well
> supported by resctrl.
> 
> My understanding of the sector cache feature is that each cache can be split
> into multiple (4) sectors. It thus seems to me something specific to the cache
> itself.
> 
> Let me try and give an example of my understanding based on the cache
> architecture described in the A64FX Microarchitecture Manual.
> 
> I see in Figure 9-2 that each processor has an L1D as well as L1I Cache, and
> twelve processors share an L2 cache. The L1D cache has 4 ways (0xF
> bitmask) and L2 cache has 16 (0xFFFF bitmask) ways. From what I understand
> the sector cache function is supported on L1D and L2.
> 
> First, the goal would be to discover all the caches on the system - since it is the
> sectors need to be programmed on each cache. On the system with 48 cores
> there would thus be 48 L1D caches, and 4 L2 caches.
> 
> Let's start by assigning the caches IDs: the L1D caches are numbered from 0 to
> 47 and the L2 caches numbered from 0 to 3.
> 
> My understanding is that the goal is to program these sectors using resctrl.
> Each cache instance can have maximum four sectors, they cannot overlap. (I do
> not know if each sector has to have some portion of cache associated with it or
> if a sector is allowed to be "empty").
> 
> So, what is needed is, for example, to have a way to say: "sector 0 on cache L1D
> with id X is assigned Y ways", "sector 1 on cache L2 with id Z is assigned XX
> ways". Is this correct?
> 
> If my understanding is correct then you can do this with resctrl as follows (I am
> making many assumptions on behavior here, especially regarding how many
> ways a sector is required to have, but I hope this could be a baseline to evaluate
> and correct my understanding and build on how this could be supported):
> 
> On boot all cache ways on all cache instances belong to sector 0:
> 
> # cd /sys/fs/resctrl/
> # cat schemata
> L1D:0=0xf;1=0xf;2=0xf;.....;47=0xf
> L2:0=0xffff;1=0xffff;2=0xffff;3=0xffff
> 
> Create sector1 and assign half of all cache ways to it:
> (In support of this it would be required that resctrl resource groups are
> exclusive. Exclusive resource groups are already supported, but they are not
> the default as would be needed here.)
> 
> First, to provide cache ways to sector 1, the cache ways need to be removed
> from sector 0:
> (I am not sure if specific ways can be assigned to a sector or just a number of
> ways, both could be supported)
> # echo 'L1D:0=0x3;1=0x3;...;47=0x3' > /sys/fs/resctrl/schemata
> # echo 'L2:0=0xff;1=0xff;2=0xff;3=0xff' > /sys/fs/resctrl/schemata
> 
> Now create sector1 (alternatively all sectors could exist on boot for this
> system):
> # mkdir /sys/fs/resctrl/sector1
> # echo 'L1D:0=0x3;1=0x3;...;47=0x3' > /sys/fs/resctrl/sector1/schemata
> # echo 'L2:0=0xff;1=0xff;2=0xff;3=0xff' > /sys/fs/resctrl/sector1/schemata
> 
> At this point there are two sectors configured. The configuration of sector0
> can be found in /sys/fs/resctrl/schemata and the configuration of sector1 in
> /sys/fs/resctrl/sector1/schemata.
> 
> >> The other part is how hardware knows which sector is being used at
> >> any moment in time. In resctrl that is programmed by writing the
> >> active class of service into needed register at the time the
> >> application is context switched (resctrl_sched_in()). This seems
> >> different here since as you describe the sector is chosen by bits in
> >> the address. Even so, which bits to set in the address needs to be
> >> programmed also and I also understand that there is a "default"
> >> sector that can be programmed via register. Could these be equivalent
> >> to what is done currently in resctrl?
> >
> > Adding this function into resctrl, there is no need to write active
> > class of service into needed register. When running a task, the sector
> > id is decided by [56:57] bits of virtual address, and these bits are
> > programed by users. When creating a resource group, the maximum number
> > of ways of each sector are set by "IMP_SCCR" setting registers.
> > As long as the task is running in a certain resource group, the sector
> > and the maximum number of ways of sectors are used will not be changed.
> > Therefore, we need not consider context switches on A64FX.
> >
> 
> The current interface would associate a "tasks" file with each sector to indicate
> which tasks run with the particular sector id. I thought there was a way to
> program the default sector id in a register, which is something that could be
> done when a task is context switched in.
> Otherwise there would need to be some re-architecting to remove the "tasks"
> association. This would be a significant change.

--------
A64FX NUMA-PE-Cache Architecture:
NUMA0:
  PE0:
    L1sector0,L1sector1,L1sector2,L1sector3
  PE1:
    L1sector0,L1sector1,L1sector2,L1sector3
  ...
  PE11:
    L1sector0,L1sector1,L1sector2,L1sector3
  
  L2sector0,1/L2sector2,3
NUMA1:
  PE0:
    L1sector0,L1sector1,L1sector2,L1sector3
  ...
  PE11:
    L1sector0,L1sector1,L1sector2,L1sector3
  
  L2sector0,1/L2sector2,3
NUMA2:
  ...
NUMA3:
  ...
--------
In the A64FX processor, each L1 sector cache capacity setting register
belongs to a single PE and is not shared among PEs. The L2 sector cache
maximum capacity setting registers are shared among the PEs in the same
NUMA node, so changing these registers from one PE also affects the
other PEs. The number of ways for the L2 sector IDs (0,1 or 2,3) can be
set through any PE in the same NUMA node. Sector IDs 0,1 and 2,3 are not
available at the same time in the same NUMA node.


I think that, in your idea, a resource group will be created for each
sector ID.
(> "sectors" could be considered the same as the resctrl "classes of service")
An example of such a resource group is as follows.
・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3. Y = 0,1,2 ... 11),
・ L2: NUMAX-L2sector0 (X = 0,1,2,3)

In this example, the sector with the same ID (0) of all PEs is allocated
to the resource group. The L1D caches are numbered from
NUMA0_PE0-L1sector0(0) to NUMA3_PE11-L1sector0(47) and the L2 caches are
numbered from NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
(NUMA number X is from 0-3, PE number Y is from 0-11)
(1) The number of ways of NUMAX-PEY-L1sector0 can be set independently
    for each PE (0-47). When running a task in this resource group,
    we cannot control which PE the task is running on or how many
    cache ways the task is using.
(2) Since L2 can only use 2 sectors at a time, when creating more than
    2 resource groups, L2sector0 will have to be allocated to a
    different resource group as well. If L2sector0 is shared by
    different resource groups, the L2 sector settings of those resource
    groups will influence each other.
And so on. There are various problems, and no merit in using resctrl.


In my idea, in order to allocate the L1 and L2 caches to a resource
group, a NUMA node is allocated to the resource group.
An example of such a resource group is as follows.
・ NUMA0-PEY-L1sectorZ (Y = 0,1,2...11. Z = 0,1,2,3)
・ NUMA0-L2sectorZZ (ZZ = 0,1,2,3)

  #cat /sys/fs/resctrl/p0/cpus
  0-11 *1
  #cat /sys/fs/resctrl/p0/schemata
  L1:0=0xF,0x3,0x1,0x0 *2
  L2:0=0xFFF,0xF,0,0 *3

*1: The PEs belong to one NUMA node. (Of course, multiple NUMA nodes can
    also be specified in one resource group.)
*2: The number of ways for L1sector0,1,2,3. In this resource group
    the number of ways of every sector0 is the same (0xF). If 0 ways are
    specified for a sector, that sector cannot be used. If 4 (0xF)
    ways are specified for a sector, that sector can use the whole cache.
    If 4 ways are specified for every sector, there is no restriction
    on cache use.
*3: The number of ways for L2 sectors 0,1. If L2sector0,1 are used,
    the number of ways of L2sector2,3 must be set to 0.
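
To make the proposed format concrete, here is a minimal sketch of how
such a line could be parsed and how rule *3 could be checked. The
comma-separated per-sector list is the format proposed in this mail,
not an existing resctrl format, and both function names are
hypothetical.

```python
def parse_sector_schemata(line):
    """Parse e.g. 'L2:0=0xFFF,0xF,0,0' into (resource, {id: [masks]})."""
    resource, rest = line.split(":", 1)
    domains = {}
    for entry in rest.split(";"):
        dom, masks = entry.split("=", 1)
        # int(x, 0) accepts both '0xF' and plain '0'.
        domains[int(dom)] = [int(m, 0) for m in masks.split(",")]
    return resource, domains

def l2_pairs_valid(masks):
    """Rule *3: sector pairs (0,1) and (2,3) must not both be in use."""
    return not (any(masks[0:2]) and any(masks[2:4]))

resource, domains = parse_sector_schemata("L2:0=0xFFF,0xF,0,0")
ways = [bin(m).count("1") for m in domains[0]]  # ways per sector
```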

All sectors with the same ID in the same resource group are set to
the same number of ways, and when running a task on A64FX, the sector
ID used by the task is determined by [56:57] bits of the virtual address.
By writing the PID to /sys/fs/resctrl/tasks, the task is bound
to the resource group, and from then on the cache size used by the task
never changes.
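
The address-based sector selection can be illustrated with plain bit
arithmetic. This is only a sketch of the tagging itself, using bits
[57:56] as described for the L1D cache in the first mail of this thread
(the first mail describes bit [56] for the L2 cache). The function
names are hypothetical, and real code would also depend on how tagged
addresses are enabled on the hardware.

```python
SECTOR_SHIFT = 56
SECTOR_MASK = 0x3 << SECTOR_SHIFT  # bits [57:56]

def tag_address(addr, sector_id):
    """Return addr with bits [57:56] replaced by sector_id (0-3)."""
    if not 0 <= sector_id <= 3:
        raise ValueError("sector_id must be 0..3")
    return (addr & ~SECTOR_MASK) | (sector_id << SECTOR_SHIFT)

def sector_of(addr):
    """Extract the sector ID encoded in bits [57:56] of addr."""
    return (addr & SECTOR_MASK) >> SECTOR_SHIFT

tagged = tag_address(0x0000FFFF12345678, 2)
print(hex(tagged), sector_of(tagged))  # 0x200ffff12345678 2
```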


Best regards,
Tan Shaopeng


* Re: About add an A64FX cache control function into resctrl
  2021-05-25  8:45             ` tan.shaopeng
@ 2021-05-26 17:36               ` Reinette Chatre
  2021-05-27  8:45                 ` tan.shaopeng
  2021-07-07 11:26                 ` tan.shaopeng
  0 siblings, 2 replies; 20+ messages in thread
From: Reinette Chatre @ 2021-05-26 17:36 UTC (permalink / raw)
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Tan Shaopeng,

On 5/25/2021 1:45 AM, tan.shaopeng@fujitsu.com wrote:
> Hi Reinette,
> 
> Sorry, I have not explained A64FX's sector cache function well yet.
> I think I need explain this function from different perspective.

You have explained the A64FX's sector cache function well. I have also  
read both specs to understand it better. It appears to me that you are  
not considering the resctrl architecture as part of your solution but  
instead just forcing your architecture onto the resctrl filesystem. For
example, in resctrl the resource groups are not just a directory
structure but have significance in what is being represented within the
directory (a class of service). The files within a resource group's
directory build on that. From your side I have not seen any effort in
aligning the sector cache function with the resctrl architecture;
instead you are just changing the resctrl interface to match the A64FX
architecture.

Could you please take a moment to understand what resctrl is and how it  
could be mapped to A64FX in a coherent way?

> 
>> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:

> --------
> A64FX NUMA-PE-Cache Architecture:
> NUMA0:
>    PE0:
>      L1sector0,L1sector1,L1sector2,L1sector3
>    PE1:
>      L1sector0,L1sector1,L1sector2,L1sector3
>    ...
>    PE11:
>      L1sector0,L1sector1,L1sector2,L1sector3
>    
>    L2sector0,1/L2sector2,3
> NUMA1:
>    PE0:
>      L1sector0,L1sector1,L1sector2,L1sector3
>    ...
>    PE11:
>      L1sector0,L1sector1,L1sector2,L1sector3
>    
>    L2sector0,1/L2sector2,3
> NUMA2:
>    ...
> NUMA3:
>    ...
> --------
> In A64FX processor, one L1 sector cache capacity setting register is
> only for one PE and not shared among PEs. L2 sector cache maximum
> capacity setting registers are shared among PEs in same NUMA, and it is
> to be noted that changing these registers in one PE influences other PE.

Understood. Cache affinity is familiar to resctrl. When a CPU comes
online, resctrl discovers which caches/resources it has affinity to.
Each resource then has a CPU mask associated with it to indicate on
which CPUs a register could be changed to configure the resource/cache.
See domain_add_cpu() and struct rdt_domain.
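
As a rough user-space model of that bookkeeping (not the kernel code,
and not the signature of domain_add_cpu()), each cache instance keeps
the set of CPUs that can program it. The class and method names are
hypothetical, and the CPU-to-L2 mapping assumes 12 consecutive CPUs per
L2 cache.

```python
from collections import defaultdict

class Domains:
    """Track, per cache instance, which online CPUs have affinity to it."""

    def __init__(self):
        self.cpu_mask = defaultdict(set)

    def add_cpu(self, cpu, cache_id):
        # Called when a CPU comes online.
        self.cpu_mask[cache_id].add(cpu)

    def remove_cpu(self, cpu, cache_id):
        # Called when a CPU goes offline.
        self.cpu_mask[cache_id].discard(cpu)

    def cpu_for(self, cache_id):
        """Pick any CPU that may write this cache's setting registers."""
        return min(self.cpu_mask[cache_id])

l2 = Domains()
for cpu in range(48):
    l2.add_cpu(cpu, cpu // 12)
```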

> The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
> any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
> the same time in same NUMA.
> 
> 
> I think, in your idea, a resource group will be created for each sector ID.
> (> "sectors" could be considered the same as the resctrl "classes of service")
> Then, an example of resource group is created as follows.
> ・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3.Y = 0,1,2 ... 11),
> ・ L2: NUMAX-L2sector0 (X = 0,1,2,3)
> 
> In this example, sector with same ID(0) of all PEs is allocated to
> resource group. The L1D caches are numbered from NUMA0_PE0-L1sector0(0)
> to NUMA3_PE11-L1sector0(47) and the L2 caches numbered from
> NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
> (NUMA number X is from 0-3, PE number Y is from 0-11)
> (1) The number of ways of NUMAX-PEY-L1sector0 can be set independently
>      for each PEs (0-47). When run a task on this resource group,
>      we cannot control on which PE the task is running on and how many
>      cache ways the task is using.

resctrl does not control the affinity on which PE/CPU a task is run.
resctrl is an interface with which to configure how resources are
allocated on the system. resctrl could thus provide an interface with
which each sector of each cache instance is assigned a number of cache
ways. resctrl also provides an interface to assign a task to a class of
service (sector id?). Through this the task obtains access to all
resources that are allocated to the particular class of service (sector
id?). Depending on which CPU the task is running on, it may indeed
experience different performance if the sector id it is running with
does not have the same allocations on all cache instances. The affinity
of the task needs to be managed separately using, for example, taskset.
Please see Documentation/x86/resctrl.rst "Examples for RDT allocation usage"
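
The two separate steps (class of service through resctrl, CPU affinity
through the scheduler) could look roughly like this from user space.
This is a sketch under assumptions: bind_task() is a hypothetical
helper, the group directory is passed in by the caller, and
os.sched_setaffinity() is used in place of the taskset command.

```python
import os

def bind_task(pid, group_dir, cpus=None):
    """Move pid into a resource group and optionally pin it to cpus."""
    # Writing a PID to the group's "tasks" file assigns the task to
    # that group's class of service.
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(f"{pid}\n")
    # resctrl does not set affinity, so that is done separately
    # (Linux only).
    if cpus is not None:
        os.sched_setaffinity(pid, cpus)

# Example (needs root and a mounted resctrl filesystem; path is
# illustrative):
# bind_task(os.getpid(), "/sys/fs/resctrl/sector1", cpus={0, 1, 2})
```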

> (2) Since L2 can only use 2 sectors at a time, when creating more than
>      2 resource groups, L2sector0 will have to be allocated to a
>      different resource group. If the L2sector0 is shared by different
>      resource groups, the L2 sector settings on resource group will be
>      influenced by each other.
> etc... there are various problems, and no merit to using resctrl.
> 
> 
> In my idea, in order to allocate the L1 and L2 cache to a resource
> group, allocate NUMA to the resource group.
> An example of resource group is as follows.
> ・ NUMA0-PEY-L1sectorZ (Y = 0,1,2...11. Z = 0,1,2,3)
> ・ NUMA0-L2sectorZZ (ZZ = 0,1,2,3)
> 
>    #cat /sys/fs/resctrl/p0/cpus
>    0-11 *1
>    #cat /sys/fs/resctrl/p0/schemata
>    L1:0=0xF,0x3,0x1,0x0 *2
>    L2:0=0xFFF,0xF,0,0 *3
> 
> *1: PEs belong one NUMA. (Of course, multiple NUMAs can also be
>      specified in one resource group)
> *2: The number of ways for L1sector0,1,2,3. On this resource group
>      the number of ways of all sector0 is the same(0xF). If 0 way is
>      specified for one sector, this sector cannot be used. If 4(0xF)
>      ways are specified for one sector, this sector can use cache fully.
>      If 4 ways are specified for each sector, there will be no
>      restriction for using cache.
> *3: The number of ways for L2 sector 0,1. If L2sector0,1 is used,
>      the number of ways of L2sector2,3 must be set to 0.
> 
> All sectors with the same ID on the same resource group were set to
> the same number of ways, and when running a task on A64FX, the sector
> ID used by task is determined by [56:57] bits of virtual address.
> By specifying the PID to /sys/fs/resctrl/tasks, the task will be bound
> to the resource group, and then, the cache size used by task will not
> be changed never.

This completely ignores how this directory and these files are currently
used. What is missing is how this implementation maps to the current
resctrl architecture.

Reinette




* RE: About add an A64FX cache control function into resctrl
  2021-05-26 17:36               ` Reinette Chatre
@ 2021-05-27  8:45                 ` tan.shaopeng
  2021-07-07 11:26                 ` tan.shaopeng
  1 sibling, 0 replies; 20+ messages in thread
From: tan.shaopeng @ 2021-05-27  8:45 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Reinette,

> On 5/25/2021 1:45 AM, tan.shaopeng@fujitsu.com wrote:
> > Hi Reinette,
> >
> > Sorry, I have not explained A64FX's sector cache function well yet.
> > I think I need explain this function from different perspective.
> 
> You have explained the A64FX's sector cache function well. I have also read
> both specs to understand it better. It appears to me that you are not considering
> the resctrl architecture as part of your solution but instead just forcing your
> architecture onto the resctrl filesystem. For example, in resctrl the resource
> groups are not just a directory structure but has significance in what is being
> represented within the directory (a class of service). The files within a resource
> group's directory build on that. From your side I have not seen any effort in
> aligning the sector cache function with the resctrl architecture but instead you
> are just changing resctrl interface to match the A64FX architecture.
> 
> Could you please take a moment to understand what resctrl is and how it could
> be mapped to A64FX in a coherent way?

Thanks for your mail.
Sorry, I misunderstood how to use
/sys/fs/resctrl/p0/cpus and /sys/fs/resctrl/tasks.
I do not think I understand resctrl well enough yet, and I will learn
more about it. If I have questions, please allow me to mail you.

> >> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
> 
> > --------
> > A64FX NUMA-PE-Cache Architecture:
> > NUMA0:
> >    PE0:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    PE1:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    ...
> >    PE11:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >
> >    L2sector0,1/L2sector2,3
> > NUMA1:
> >    PE0:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    ...
> >    PE11:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >
> >    L2sector0,1/L2sector2,3
> > NUMA2:
> >    ...
> > NUMA3:
> >    ...
> > --------
> > In A64FX processor, one L1 sector cache capacity setting register is
> > only for one PE and not shared among PEs. L2 sector cache maximum
> > capacity setting registers are shared among PEs in same NUMA, and it
> > is to be noted that changing these registers in one PE influences other PE.
> 
> Understood. cache affinity is familiar to resctrl. When a CPU becomes online it
> is discovered which caches/resources it has affinity to.
> Resources then have CPU mask associated with them to indicate on which
> CPU a register could be changed to configure the resource/cache. See
> domain_add_cpu() and struct rdt_domain.
> 
> > The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
> > any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
> > the same time in same NUMA.
> >
> >
> > I think, in your idea, a resource group will be created for each sector ID.
> > (> "sectors" could be considered the same as the resctrl "classes of
> > service") Then, an example of resource group is created as follows.
> > ・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3.Y = 0,1,2 ... 11),
> > ・ L2: NUMAX-L2sector0 (X = 0,1,2,3)
> >
> > In this example, sector with same ID(0) of all PEs is allocated to
> > resource group. The L1D caches are numbered from
> > NUMA0_PE0-L1sector0(0) to NUMA3_PE11-L1sector0(47) and the L2 caches
> > numbered from
> > NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
> > (NUMA number X is from 0-3, PE number Y is from 0-11)
> > (1) The number of ways of NUMAX-PEY-L1sector0 can be set independently
> >      for each PEs (0-47). When run a task on this resource group,
> >      we cannot control on which PE the task is running on and how many
> >      cache ways the task is using.
> 
> resctrl does not control the affinity on which PE/CPU a task is run.
> resctrl is an interface with which to configure how resources are allocated on
> the system. resctrl could thus provide interface with which each sector of each
> cache instance is assigned a number of cache ways.
> resctrl also provides an interface to assign a task with a class of service (sector
> id?). Through this the task obtains access to all resources that is allocated to
> the particular class of service (sector id?). Depending on which CPU the task is
> running it may indeed experience different performance if the sector id it is
> running with does not have the same allocations on all cache instances. The
> affinity of the task needs to be managed separately using for example taskset.
> Please see Documentation/x86/resctrl.rst "Examples for RDT allocation usage"
> 
> > (2) Since L2 can only use 2 sectors at a time, when creating more than
> >      2 resource groups, L2sector0 will have to be allocated to a
> >      different resource group. If the L2sector0 is shared by different
> >      resource groups, the L2 sector settings on resource group will be
> >      influenced by each other.
> > etc... there are various problems, and no merit to using resctrl.
> >
> >
> > In my idea, in order to allocate the L1 and L2 cache to a resource
> > group, allocate NUMA to the resource group.
> > An example of resource group is as follows.
> > ・ NUMA0-PEY-L1sectorZ (Y = 0,1,2...11. Z = 0,1,2,3)
> > ・ NUMA0-L2sectorZZ (ZZ = 0,1,2,3)
> >
> >    #cat /sys/fs/resctrl/p0/cpus
> >    0-11 *1
> >    #cat /sys/fs/resctrl/p0/schemata
> >    L1:0=0xF,0x3,0x1,0x0 *2
> >    L2:0=0xFFF,0xF,0,0 *3
> >
> > *1: PEs belong one NUMA. (Of course, multiple NUMAs can also be
> >      specified in one resource group)
> > *2: The number of ways for L1sector0,1,2,3. On this resource group
> >      the number of ways of all sector0 is the same(0xF). If 0 way is
> >      specified for one sector, this sector cannot be used. If 4(0xF)
> >      ways are specified for one sector, this sector can use cache fully.
> >      If 4 ways are specified for each sector, there will be no
> >      restriction for using cache.
> > *3: The number of ways for L2 sector 0,1. If L2sector0,1 is used,
> >      the number of ways of L2sector2,3 must be set to 0.
> >
> > All sectors with the same ID on the same resource group were set to
> > the same number of ways, and when running a task on A64FX, the sector
> > ID used by task is determined by [56:57] bits of virtual address.
> > By specifying the PID to /sys/fs/resctrl/tasks, the task will be bound
> > to the resource group, and then, the cache size used by task will not
> > be changed never.
> 
> This completely ignores how this directory and files are currently used.
> > What is missing is how this implementation maps to the current resctrl
> > architecture.

Best regards,
Tan Shaopeng

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: About add an A64FX cache control function into resctrl
  2021-05-26 17:36               ` Reinette Chatre
  2021-05-27  8:45                 ` tan.shaopeng
@ 2021-07-07 11:26                 ` tan.shaopeng
  2021-07-16  0:49                   ` tan.shaopeng
  2021-07-19 23:25                   ` Reinette Chatre
  1 sibling, 2 replies; 20+ messages in thread
From: tan.shaopeng @ 2021-07-07 11:26 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Reinette,

> > Sorry, I have not explained A64FX's sector cache function well yet.
> > I think I need explain this function from different perspective.
> 
> You have explained the A64FX's sector cache function well. I have also read
> both specs to understand it better. It appears to me that you are not considering
> the resctrl architecture as part of your solution but instead just forcing your
> architecture onto the resctrl filesystem. For example, in resctrl the resource
> groups are not just a directory structure but has significance in what is being
> represented within the directory (a class of service). The files within a resource
> group's directory build on that. From your side I have not seen any effort in
> aligning the sector cache function with the resctrl architecture but instead you
> are just changing resctrl interface to match the A64FX architecture.
> 
> Could you please take a moment to understand what resctrl is and how it could
> be mapped to A64FX in a coherent way?

Previously, my idea was based on how to make instructions use different
sectors within one task. After studying resctrl, I think that to make
use of the resctrl architecture on A64FX it is better to assign one
sector to one task. Thank you for your idea that "sectors" could be
considered the same as the resctrl "classes of service".

Based on your idea, I am considering the implementation details.
In this email, I will explain the outline of the new proposal, and then
please allow me to confirm a few technical points about resctrl.

The outline of my proposal is as follows.
- Add a sector function equivalent to Intel's CAT function to resctrl.
  (divide the shared L2 cache into multiple partitions for use by
  multiple cores)
- Allocate one sector to one resource group (one CLOSID). Since one
  core can only be assigned to one resource group, on A64FX each core
  only uses one sector at a time.
- Disable A64FX's HPC tag address override function. We only set each
  core's default sector value according to its CLOSID
  (default sector ID = CLOSID).
- No L1 cache control, since the L1 cache is not shared among cores. It
  is not necessary to add an L1 cache interface to the schemata file.
- No need to update the schemata interface. Resctrl's L2 cache interface
  (L2: <cache_id0> = <cbm>; <cache_id1> = <cbm>; ...)
  will be used as it is. However, on A64FX, <cbm> does not indicate
  the position of the cache partition; it only indicates the number of
  cache ways (size).
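
To illustrate the last point, here is a minimal sketch in C, assuming
that only the count of set bits in <cbm> matters on A64FX. The helper
name cbm_to_way_count() is hypothetical and is not an existing resctrl
function.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of resctrl): the position of the set
 * bits in <cbm> is ignored, and only their count is taken as the
 * number of cache ways given to the sector.
 */
static unsigned int cbm_to_way_count(uint32_t cbm)
{
	unsigned int ways = 0;

	while (cbm) {
		cbm &= cbm - 1;	/* clear the lowest set bit */
		ways++;
	}
	return ways;
}
```

With this interpretation, 0xF and 0xF0 would both request 4 ways, which
differs from CAT, where the two masks name different parts of the cache.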

This is a minimal first step toward incorporating the sector cache
function into resctrl. After finishing this, I will consider whether we
could add more sector cache features to resctrl (e.g. selecting
different sectors from one task).

(some questions are below)

> >
> >> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
> 
> > --------
> > A64FX NUMA-PE-Cache Architecture:
> > NUMA0:
> >    PE0:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    PE1:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    ...
> >    PE11:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >
> >    L2sector0,1/L2sector2,3
> > NUMA1:
> >    PE0:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >    ...
> >    PE11:
> >      L1sector0,L1sector1,L1sector2,L1sector3
> >
> >    L2sector0,1/L2sector2,3
> > NUMA2:
> >    ...
> > NUMA3:
> >    ...
> > --------
> > In A64FX processor, one L1 sector cache capacity setting register is
> > only for one PE and not shared among PEs. L2 sector cache maximum
> > capacity setting registers are shared among PEs in same NUMA, and it
> > is to be noted that changing these registers in one PE influences other PE.
> 
> Understood. cache affinity is familiar to resctrl. When a CPU becomes online it
> is discovered which caches/resources it has affinity to.
> Resources then have CPU mask associated with them to indicate on which
> CPU a register could be changed to configure the resource/cache. See
> domain_add_cpu() and struct rdt_domain.

Is the following understanding correct?
struct rdt_domain represents a group of online CPUs that share the same
cache instance. When a CPU comes online (resctrl initialization),
the domain_add_cpu() function adds the online CPU to the corresponding
rdt_domain (in the rdt_resource:domains list). For example, if there are
4 L2 cache instances, then there will be 4 rdt_domain entries in the
list and each CPU is assigned to the corresponding rdt_domain.

The configured values for cache/memory are stored in the *ctrl_val array
(indexed by CLOSID) of struct rdt_domain. For example, for the CAT
function, the CBM value of CLOSID=x is stored in ctrl_val[x].
When we create a resource group and write the cache settings into
the schemata file, the update_domains() function stores the CBM value
in ctrl_val[CLOSID = resource group ID] in rdt_domain and writes the
CBM value to the CBM register (MSR_IA32_Lx_CBM_BASE).
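
The understanding above can be sketched as a simplified model in C.
This is only an illustration: struct domain_model, MAX_CLOSID and
domain_set_cbm() are made-up names, the array is fixed-size here
although ctrl_val is described above as a pointer, and the hardware
register write is left out.

```c
#include <stdint.h>

#define MAX_CLOSID 4	/* assumption for this sketch only */

/* Simplified stand-in for struct rdt_domain (one per cache instance). */
struct domain_model {
	int id;				/* cache instance id */
	uint32_t ctrl_val[MAX_CLOSID];	/* one control value (CBM) per CLOSID */
};

/*
 * Store a new CBM for one CLOSID in one domain.
 * Returns 0 on success, -1 if the CLOSID is out of range.
 */
static int domain_set_cbm(struct domain_model *d, unsigned int closid,
			  uint32_t cbm)
{
	if (closid >= MAX_CLOSID)
		return -1;
	d->ctrl_val[closid] = cbm;
	/* A real implementation would also update the hardware register. */
	return 0;
}
```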

> > The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
> > any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
> > the same time in same NUMA.
> >
> >
> > I think, in your idea, a resource group will be created for each sector ID.
> > (> "sectors" could be considered the same as the resctrl "classes of
> > service") Then, an example of resource group is created as follows.
> > ・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3.Y = 0,1,2 ... 11),
> > ・ L2: NUMAX-L2sector0 (X = 0,1,2,3)
> >
> > In this example, the sector with the same ID (0) of all PEs is allocated
> > to the resource group. The L1D caches are numbered from
> > NUMA0_PE0-L1sector0(0) to NUMA3_PE11-L1sector0(47) and the L2 caches
> > are numbered from NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
> > (NUMA number X is from 0-3, PE number Y is from 0-11)
> > (1) The number of ways of NUMAX-PEY-L1sector0 can be set independently
> >      for each PEs (0-47). When run a task on this resource group,
> >      we cannot control on which PE the task is running on and how many
> >      cache ways the task is using.
> 
> resctrl does not control the affinity on which PE/CPU a task is run.
> resctrl is an interface with which to configure how resources are allocated on
> the system. resctrl could thus provide interface with which each sector of each
> cache instance is assigned a number of cache ways.
> resctrl also provides an interface to assign a task with a class of service (sector
> id?). Through this the task obtains access to all resources that is allocated to
> the particular class of service (sector id?). Depending on which CPU the task is
> running it may indeed experience different performance if the sector id it is
> running with does not have the same allocations on all cache instances. The
> affinity of the task needs to be managed separately using for example taskset.
> Please see Documentation/x86/resctrl.rst "Examples for RDT allocation usage"

In resctrl_sched_in(), there is the following comment:
  /*
   * If this task has a closid/rmid assigned, use it.
   * Else use the closid/rmid assigned to this cpu.
   */
I thought that when we write a PID to the tasks file, this task (PID)
will only run on the CPUs which are specified in the cpus file of the
same resource group, so that the task_struct's closid and the CPU's
closid are the same. When is a task's closid different from the CPU's
closid?


Best regards,
Tan Shaopeng



^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: About add an A64FX cache control function into resctrl
  2021-07-07 11:26                 ` tan.shaopeng
@ 2021-07-16  0:49                   ` tan.shaopeng
  2021-07-19 23:25                   ` Reinette Chatre
  1 sibling, 0 replies; 20+ messages in thread
From: tan.shaopeng @ 2021-07-16  0:49 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse', misono.tomohiro, 'Luck, Tony'

Hi Reinette,

> > > Sorry, I have not explained A64FX's sector cache function well yet.
> > > I think I need explain this function from different perspective.
> >
> > You have explained the A64FX's sector cache function well. I have also
> > read both specs to understand it better. It appears to me that you are
> > not considering the resctrl architecture as part of your solution but
> > instead just forcing your architecture onto the resctrl filesystem.
> > For example, in resctrl the resource groups are not just a directory
> > structure but has significance in what is being represented within the
> > directory (a class of service). The files within a resource group's
> > directory build on that. From your side I have not seen any effort in
> > aligning the sector cache function with the resctrl architecture but
> > instead you are just changing resctrl interface to match the A64FX
> > architecture.
> >
> > Could you please take a moment to understand what resctrl is and how
> > it could be mapped to A64FX in a coherent way?
> 
> Previously, my idea is based on how to make instructions use different sectors
> in one task. After I studied resctrl, to utilize resctrl architecture on A64FX, I
> think it’s better to assign one sector to one task. Thanks for your idea that
> "sectors" could be considered the same as the resctrl "classes of service".
> 
> Based on your idea, I am considering the implementation details.
> In this email, I will explain the outline of new proposal, and then please allow
> me to confirm a few technologies about resctrl.

Could you give me some comments and advice?

Best regards,
Tan Shaopeng

> The outline of my proposal is as follows.
> - Add a sector function equivalent to Intel's CAT function into resctrl.
>   (divide shared L2 cache into multiple partitions for multiple cores use)
> - Allocate one sector to one resource group (one CLOSID). Since one
>   core can only be assigned to one resource group, on A64FX each core
>   only uses one sector at a time.
> - Disable A64FX's HPC tag address override function. We only set each
>   core's default sector value according to closid(default sector ID=CLOSID).
> - No L1 cache control since L1 cache is not shared for cores. It is not
>   necessary to add L1 cache interface for schemata file.
> - No need to update schemata interface. Resctrl's L2 cache interface
>   (L2: <cache_id0> = <cbm>; <cache_id1> = <cbm>; ...)
>   will be used as it is. However, on A64FX, <cbm> does not indicate
>   the position of cache partition, only indicate the number of
>   cache ways (size).
> 
> This is the smallest start of incorporating sector cache function into resctrl. I
> will consider if we could add more sector cache features into resctrl (e.g.
> selecting different sectors from one task) after finishing this.
> 
> (some questions are below)
> 
> > >
> > >> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
> >
> > > --------
> > > A64FX NUMA-PE-Cache Architecture:
> > > NUMA0:
> > >    PE0:
> > >      L1sector0,L1sector1,L1sector2,L1sector3
> > >    PE1:
> > >      L1sector0,L1sector1,L1sector2,L1sector3
> > >    ...
> > >    PE11:
> > >      L1sector0,L1sector1,L1sector2,L1sector3
> > >
> > >    L2sector0,1/L2sector2,3
> > > NUMA1:
> > >    PE0:
> > >      L1sector0,L1sector1,L1sector2,L1sector3
> > >    ...
> > >    PE11:
> > >      L1sector0,L1sector1,L1sector2,L1sector3
> > >
> > >    L2sector0,1/L2sector2,3
> > > NUMA2:
> > >    ...
> > > NUMA3:
> > >    ...
> > > --------
> > > In A64FX processor, one L1 sector cache capacity setting register is
> > > only for one PE and not shared among PEs. L2 sector cache maximum
> > > capacity setting registers are shared among PEs in same NUMA, and it
> > > is to be noted that changing these registers in one PE influences other PE.
> >
> > Understood. cache affinity is familiar to resctrl. When a CPU becomes
> > online it is discovered which caches/resources it has affinity to.
> > Resources then have CPU mask associated with them to indicate on which
> > CPU a register could be changed to configure the resource/cache. See
> > domain_add_cpu() and struct rdt_domain.
> 
> Is the following understanding correct?
> Struct rdt_domain is a group of online CPUs that share a same cache instance.
> When a CPU is online(resctrl initialization), the domain_add_cpu() function
> add the online cpu to corresponding rdt_domain (in rdt_resource:domains list).
> For example, if there are
> 4 L2 cache instances, then there will be 4 rdt_domain in the list and each CPU
> is assigned to corresponding rdt_domain.
> 
> The set values of cache/memory are stored in the *ctrl_val array (indexed by
> CLOSID) of struct rdt_domain. For example, in CAT function, the CBM value of
> CLOSID=x is stored in ctrl_val [x].
> When we create a resource group and write set values of cache into the
> schemata file, the update_domains() function updates the CBM value to
> ctrl_val [CLOSID = resource group ID] in rdt_domain and updates the CBM
> value to CBM register(MSR_IA32_Lx_CBM_BASE).
> 
> > > The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
> > > any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
> > > the same time in same NUMA.
> > >
> > >
> > > I think, in your idea, a resource group will be created for each sector ID.
> > > (> "sectors" could be considered the same as the resctrl "classes of
> > > service") Then, an example of resource group is created as follows.
> > > ・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3.Y = 0,1,2 ... 11),
> > > ・ L2: NUMAX-L2sector0 (X = 0,1,2,3)
> > >
> > > In this example, the sector with the same ID (0) of all PEs is
> > > allocated to the resource group. The L1D caches are numbered from
> > > NUMA0_PE0-L1sector0(0) to NUMA3_PE11-L1sector0(47) and the L2 caches
> > > are numbered from NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
> > > (NUMA number X is from 0-3, PE number Y is from 0-11)
> > > (1) The number of ways of NUMAX-PEY-L1sector0 can be set independently
> > >      for each PEs (0-47). When run a task on this resource group,
> > >      we cannot control on which PE the task is running on and how many
> > >      cache ways the task is using.
> >
> > resctrl does not control the affinity on which PE/CPU a task is run.
> > resctrl is an interface with which to configure how resources are
> > allocated on the system. resctrl could thus provide interface with
> > which each sector of each cache instance is assigned a number of cache
> > ways.
> > resctrl also provides an interface to assign a task with a class of
> > service (sector id?). Through this the task obtains access to all
> > resources that is allocated to the particular class of service (sector
> > id?). Depending on which CPU the task is running it may indeed
> > experience different performance if the sector id it is running with
> > does not have the same allocations on all cache instances. The affinity
> > of the task needs to be managed separately using for example taskset.
> > Please see Documentation/x86/resctrl.rst "Examples for RDT allocation
> > usage"
> 
> In resctrl_sched_in(), there are comments as follow:
>   /*
>    * If this task has a closid/rmid assigned, use it.
>    * Else use the closid/rmid assigned to this cpu.
>    */
> I thought when we write PID to tasks file, this task (PID) will only run on the
> CPUs which are specified in cpus file in the same resource group. So, the
> task_struct's closid and cpu's closid is the same.
> When task's closid is different from cpu's closid?
> 
> 
> Best regards,
> Tan Shaopeng



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: About add an A64FX cache control function into resctrl
  2021-07-07 11:26                 ` tan.shaopeng
  2021-07-16  0:49                   ` tan.shaopeng
@ 2021-07-19 23:25                   ` Reinette Chatre
  2021-07-21  8:10                     ` tan.shaopeng
  1 sibling, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2021-07-19 23:25 UTC (permalink / raw)
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Tan Shaopeng,

On 7/7/2021 4:26 AM, tan.shaopeng@fujitsu.com wrote:
>>> Sorry, I have not explained A64FX's sector cache function well yet.
>>> I think I need explain this function from different perspective.
>>
>> You have explained the A64FX's sector cache function well. I have also read
>> both specs to understand it better. It appears to me that you are not considering
>> the resctrl architecture as part of your solution but instead just forcing your
>> architecture onto the resctrl filesystem. For example, in resctrl the resource
>> groups are not just a directory structure but has significance in what is being
>> represented within the directory (a class of service). The files within a resource
>> group's directory build on that. From your side I have not seen any effort in
>> aligning the sector cache function with the resctrl architecture but instead you
>> are just changing resctrl interface to match the A64FX architecture.
>>
>> Could you please take a moment to understand what resctrl is and how it could
>> be mapped to A64FX in a coherent way?
> 
> Previously, my idea is based on how to make instructions use different
> sectors in one task. After I studied resctrl, to utilize resctrl
> architecture on A64FX, I think it’s better to assign one sector to
> one task. Thanks for your idea that "sectors" could be considered the
> same as the resctrl "classes of service".
> 
> Based on your idea, I am considering the implementation details.
> In this email, I will explain the outline of new proposal, and then
> please allow me to confirm a few technologies about resctrl.
> 
> The outline of my proposal is as follows.
> - Add a sector function equivalent to Intel's CAT function into resctrl.
>    (divide shared L2 cache into multiple partitions for multiple cores use)
> - Allocate one sector to one resource group (one CLOSID). Since one
>    core can only be assigned to one resource group, on A64FX each core
>    only uses one sector at a time.

ok, so a sector is a portion of cache and matches with what can be 
represented with a resource group.

The second part of your comment is not clear to me. In the first part 
you mention: "one core can only be assigned to one resource group" - 
this seems to indicate some static assignment between cores and sectors 
and if this is the case this needs more thinking since the current 
implementation assumes that any core that can access the cache can 
access all resource groups associated with that cache. On the other 
hand, you mention "on A64FX each core only uses one sector at a time" - 
this now sounds dynamic and is how resctrl works since the CPU is 
assigned a single class of service to indicate all resources accessible 
to it.

> - Disable A64FX's HPC tag address override function. We only set each
>    core's default sector value according to closid(default sector ID=CLOSID).
> - No L1 cache control since L1 cache is not shared for cores. It is not
>    necessary to add L1 cache interface for schemata file.
> - No need to update schemata interface. Resctrl's L2 cache interface
>    (L2: <cache_id0> = <cbm>; <cache_id1> = <cbm>; ...)
>    will be used as it is. However, on A64FX, <cbm> does not indicate
>    the position of cache partition, only indicate the number of
>    cache ways (size).

From what I understand the upcoming MPAM support would make this easier 
to do.

> 
> This is the smallest start of incorporating sector cache function into
> resctrl. I will consider if we could add more sector cache features
> into resctrl (e.g. selecting different sectors from one task) after
> finishing this.
> 
> (some questions are below)
> 
>>>
>>>> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
>>
>>> --------
>>> A64FX NUMA-PE-Cache Architecture:
>>> NUMA0:
>>>     PE0:
>>>       L1sector0,L1sector1,L1sector2,L1sector3
>>>     PE1:
>>>       L1sector0,L1sector1,L1sector2,L1sector3
>>>     ...
>>>     PE11:
>>>       L1sector0,L1sector1,L1sector2,L1sector3
>>>
>>>     L2sector0,1/L2sector2,3
>>> NUMA1:
>>>     PE0:
>>>       L1sector0,L1sector1,L1sector2,L1sector3
>>>     ...
>>>     PE11:
>>>       L1sector0,L1sector1,L1sector2,L1sector3
>>>
>>>     L2sector0,1/L2sector2,3
>>> NUMA2:
>>>     ...
>>> NUMA3:
>>>     ...
>>> --------
>>> In A64FX processor, one L1 sector cache capacity setting register is
>>> only for one PE and not shared among PEs. L2 sector cache maximum
>>> capacity setting registers are shared among PEs in same NUMA, and it
>>> is to be noted that changing these registers in one PE influences other PE.
>>
>> Understood. cache affinity is familiar to resctrl. When a CPU becomes online it
>> is discovered which caches/resources it has affinity to.
>> Resources then have CPU mask associated with them to indicate on which
>> CPU a register could be changed to configure the resource/cache. See
>> domain_add_cpu() and struct rdt_domain.
> 
> Is the following understanding correct?
> Struct rdt_domain is a group of online CPUs that share a same cache
> instance. When a CPU is online(resctrl initialization),
> the domain_add_cpu() function add the online cpu to corresponding
> rdt_domain (in rdt_resource:domains list). For example, if there are
> 4 L2 cache instances, then there will be 4 rdt_domain in the list and
> each CPU is assigned to corresponding rdt_domain.

Correct.

> 
> The set values of cache/memory are stored in the *ctrl_val array
> (indexed by CLOSID) of struct rdt_domain. For example, in CAT function,
> the CBM value of CLOSID=x is stored in ctrl_val [x].
> When we create a resource group and write set values of cache into
> the schemata file, the update_domains() function updates the CBM value
> to ctrl_val [CLOSID = resource group ID] in rdt_domain and updates the
> CBM value to CBM register(MSR_IA32_Lx_CBM_BASE).

For the most part, yes. The only part that I would like to clarify is 
that each CLOSID is represented by a different register; which register 
is updated depends on which CLOSID is changed. This could be written as 
MSR_IA32_L2_CBM_CLOSID/MSR_IA32_L3_CBM_CLOSID. The "BASE" register is 
CLOSID 0, the default, and the other registers are determined as an 
offset from it.

Also, the registers have the scope of the resource/cache. So, for 
example, if CPU 0 and CPU 1 share a L2 cache then it is only necessary 
to update the register on one of these CPUs.
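
As a rough sketch of the register selection described above (assuming 
no CDP, so that the CLOSID is used directly as the offset): the helper 
name cbm_msr_for_closid() is made up for illustration, and the base 
values are the ones I recall from the kernel's msr-index.h, so please 
verify them against the headers.

```c
#include <stdint.h>

/* Base MSR addresses (CLOSID 0); values assumed from msr-index.h. */
#define MSR_IA32_L3_CBM_BASE	0xc90
#define MSR_IA32_L2_CBM_BASE	0xd10

/*
 * Hypothetical helper: address of the MSR that holds the CBM for the
 * given CLOSID, computed as an offset from the "BASE" register.
 */
static uint32_t cbm_msr_for_closid(uint32_t base, uint32_t closid)
{
	return base + closid;
}
```

The write itself only needs to happen on one CPU of the domain, since 
the register has the scope of the cache instance.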

> 
>>> The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
>>> any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
>>> the same time in same NUMA.
>>>
>>>
>>> I think, in your idea, a resource group will be created for each sector ID.
>>> (> "sectors" could be considered the same as the resctrl "classes of
>>> service") Then, an example of resource group is created as follows.
>>> ・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3.Y = 0,1,2 ... 11),
>>> ・ L2: NUMAX-L2sector0 (X = 0,1,2,3)
>>>
>>> In this example, the sector with the same ID (0) of all PEs is allocated
>>> to the resource group. The L1D caches are numbered from
>>> NUMA0_PE0-L1sector0(0) to NUMA3_PE11-L1sector0(47) and the L2 caches
>>> are numbered from NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
>>> (NUMA number X is from 0-3, PE number Y is from 0-11)
>>> (1) The number of ways of NUMAX-PEY-L1sector0 can be set independently
>>>       for each PEs (0-47). When run a task on this resource group,
>>>       we cannot control on which PE the task is running on and how many
>>>       cache ways the task is using.
>>
>> resctrl does not control the affinity on which PE/CPU a task is run.
>> resctrl is an interface with which to configure how resources are allocated on
>> the system. resctrl could thus provide interface with which each sector of each
>> cache instance is assigned a number of cache ways.
>> resctrl also provides an interface to assign a task with a class of service (sector
>> id?). Through this the task obtains access to all resources that is allocated to
>> the particular class of service (sector id?). Depending on which CPU the task is
>> running it may indeed experience different performance if the sector id it is
>> running with does not have the same allocations on all cache instances. The
>> affinity of the task needs to be managed separately using for example taskset.
>> Please see Documentation/x86/resctrl.rst "Examples for RDT allocation usage"
> 
> In resctrl_sched_in(), there are comments as follow:
>    /*
>     * If this task has a closid/rmid assigned, use it.
>     * Else use the closid/rmid assigned to this cpu.
>     */
> I thought when we write PID to tasks file, this task (PID) will only
> run on the CPUs which are specified in cpus file in the same resource
> group. So, the task_struct's closid and cpu's closid is the same.
> When task's closid is different from cpu's closid?

resctrl does not manage the affinity of tasks.

Tony recently summarized the cpus file very well to me: The actual 
semantics of the CPUs file is to associate a CLOSid for a task that is 
in the default resctrl group – while it is running on one of the listed 
CPUs.

To answer your question the task's closid could be different from the 
CPU's closid if the task's closid is 0 while it is running on a CPU that 
is in the cpus file of a non-default resource group.

You can see a summary of the decision flow in section "Resource 
allocation rules" in Documentation/x86/resctrl.rst
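
A minimal sketch of that decision in C follows. It is a simplification 
for illustration only, not the actual resctrl_sched_in() code; the 
function name effective_closid() and the DEFAULT_CLOSID macro are made 
up, and cpu_closid is assumed to be 0 when the CPU is not listed in any 
non-default group's cpus file.

```c
#include <stdint.h>

#define DEFAULT_CLOSID 0	/* CLOSID of the default resource group */

/*
 * Pick the CLOSID that applies while a task runs on a CPU:
 * a task in a non-default group keeps its own CLOSID, otherwise
 * the CLOSID assigned to the CPU (possibly the default) is used.
 */
static uint32_t effective_closid(uint32_t task_closid, uint32_t cpu_closid)
{
	if (task_closid != DEFAULT_CLOSID)
		return task_closid;
	return cpu_closid;
}
```

So a task with closid 0 running on a CPU that belongs to group 1 would 
run with closid 1, which is the case where the two values differ.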

The "cpus" file was created in support of the real-time use cases. In 
these use cases a group of CPUs can be designated as supporting the 
real-time work, given their own resource group, and assigned the 
resources needed to do the real-time work. A real-time task can then be 
started with affinity to those CPUs and, dynamically, any kernel threads 
(that will be started on the same CPU) doing work on behalf of this task 
would be able to use the resources set aside for the real-time work.

Reinette


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: About add an A64FX cache control function into resctrl
  2021-07-19 23:25                   ` Reinette Chatre
@ 2021-07-21  8:10                     ` tan.shaopeng
  2021-07-21 23:39                       ` Reinette Chatre
  0 siblings, 1 reply; 20+ messages in thread
From: tan.shaopeng @ 2021-07-21  8:10 UTC (permalink / raw)
  To: 'Reinette Chatre', 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Reinette,

> On 7/7/2021 4:26 AM, tan.shaopeng@fujitsu.com wrote:
> >>> Sorry, I have not explained A64FX's sector cache function well yet.
> >>> I think I need explain this function from different perspective.
> >>
> >> You have explained the A64FX's sector cache function well. I have
> >> also read both specs to understand it better. It appears to me that
> >> you are not considering the resctrl architecture as part of your
> >> solution but instead just forcing your architecture onto the resctrl
> >> filesystem. For example, in resctrl the resource groups are not just
> >> a directory structure but has significance in what is being
> >> represented within the directory (a class of service). The files
> >> within a resource group's directory build on that. From your side I
> >> have not seen any effort in aligning the sector cache function with the
> >> resctrl architecture but instead you are just changing resctrl interface
> >> to match the A64FX architecture.
> >>
> >> Could you please take a moment to understand what resctrl is and how
> >> it could be mapped to A64FX in a coherent way?
> >
> > Previously, my idea is based on how to make instructions use different
> > sectors in one task. After I studied resctrl, to utilize resctrl
> > architecture on A64FX, I think it’s better to assign one sector to one
> > task. Thanks for your idea that "sectors" could be considered the same
> > as the resctrl "classes of service".
> >
> > Based on your idea, I am considering the implementation details.
> > In this email, I will explain the outline of new proposal, and then
> > please allow me to confirm a few technologies about resctrl.
> >
> > The outline of my proposal is as follows.
> > - Add a sector function equivalent to Intel's CAT function into resctrl.
> >    (divide shared L2 cache into multiple partitions for multiple cores
> >    use)
> > - Allocate one sector to one resource group (one CLOSID). Since one
> >    core can only be assigned to one resource group, on A64FX each core
> >    only uses one sector at a time.
> 
> ok, so a sector is a portion of cache and matches with what can be represented
> with a resource group.
> 
> The second part of your comment is not clear to me. In the first part you
> mention: "one core can only be assigned to one resource group" - this seems to
> indicate some static assignment between cores and sectors and if this is the

Sorry, does "static assignment between cores and sectors" mean that 
each core always uses a fixed sector ID, for example that core 0 always 
uses sector 0 in every case? If so, that is not what I meant.

> case this needs more thinking since the current implementation assumes that
> any core that can access the cache can access all resource groups associated
> with that cache. On the other hand, you mention "on A64FX each core only uses
> one sector at a time" - this now sounds dynamic and is how resctrl works since
> the CPU is assigned a single class of service to indicate all resources
> accessible to it.

That is correct. Each core can be assigned to any resource group, and 
each core only uses one sector at a time. Additionally, which sector 
each core uses depends on the resource group (class of service) ID.

> > - Disable A64FX's HPC tag address override function. We only set each
> >    core's default sector value according to closid(default sector
> >    ID=CLOSID).
> > - No L1 cache control since L1 cache is not shared for cores. It is not
> >    necessary to add L1 cache interface for schemata file.
> > - No need to update schemata interface. Resctrl's L2 cache interface
> >    (L2: <cache_id0> = <cbm>; <cache_id1> = <cbm>; ...)
> >    will be used as it is. However, on A64FX, <cbm> does not indicate
> >    the position of cache partition, only indicate the number of
> >    cache ways (size).
> 
> From what I understand the upcoming MPAM support would make this easier
> to do.
> 
> >
> > This is the smallest start of incorporating sector cache function into
> > resctrl. I will consider if we could add more sector cache features
> > into resctrl (e.g. selecting different sectors from one task) after
> > finishing this.
> >
> > (some questions are below)
> >
> >>>
> >>>> On 5/17/2021 1:31 AM, tan.shaopeng@fujitsu.com wrote:
> >>
> >>> --------
> >>> A64FX NUMA-PE-Cache Architecture:
> >>> NUMA0:
> >>>     PE0:
> >>>       L1sector0,L1sector1,L1sector2,L1sector3
> >>>     PE1:
> >>>       L1sector0,L1sector1,L1sector2,L1sector3
> >>>     ...
> >>>     PE11:
> >>>       L1sector0,L1sector1,L1sector2,L1sector3
> >>>
> >>>     L2sector0,1/L2sector2,3
> >>> NUMA1:
> >>>     PE0:
> >>>       L1sector0,L1sector1,L1sector2,L1sector3
> >>>     ...
> >>>     PE11:
> >>>       L1sector0,L1sector1,L1sector2,L1sector3
> >>>
> >>>     L2sector0,1/L2sector2,3
> >>> NUMA2:
> >>>     ...
> >>> NUMA3:
> >>>     ...
> >>> --------
> >>> In A64FX processor, one L1 sector cache capacity setting register is
> >>> only for one PE and not shared among PEs. L2 sector cache maximum
> >>> capacity setting registers are shared among PEs in same NUMA, and it
> >>> is to be noted that changing these registers in one PE influences other PE.
> >>
> >> Understood. cache affinity is familiar to resctrl. When a CPU becomes
> >> online it is discovered which caches/resources it has affinity to.
> >> Resources then have CPU mask associated with them to indicate on
> >> which CPU a register could be changed to configure the
> >> resource/cache. See
> >> domain_add_cpu() and struct rdt_domain.
> >
> > Is the following understanding correct?
> > Struct rdt_domain is a group of online CPUs that share a same cache
> > instance. When a CPU is online(resctrl initialization), the
> > domain_add_cpu() function add the online cpu to corresponding
> > rdt_domain (in rdt_resource:domains list). For example, if there are
> > 4 L2 cache instances, then there will be 4 rdt_domain in the list and
> > each CPU is assigned to corresponding rdt_domain.
> 
> Correct.
> 
> >
> > The set values of cache/memory are stored in the *ctrl_val array
> > (indexed by CLOSID) of struct rdt_domain. For example, in CAT
> > function, the CBM value of CLOSID=x is stored in ctrl_val [x].
> > When we create a resource group and write set values of cache into the
> > schemata file, the update_domains() function updates the CBM value to
> > ctrl_val [CLOSID = resource group ID] in rdt_domain and updates the
> > CBM value to CBM register(MSR_IA32_Lx_CBM_BASE).
> 
> For the most part, yes. The only part that I would like to clarify is that each
> CLOSID is represented by a different register, which register is updated
> depends on which CLOSID is changed. Could be written as
> MSR_IA32_L2_CBM_CLOSID/MSR_IA32_L3_CBM_CLOSID. The "BASE"
> register is CLOSID 0, the default, and the other registers are determined as
> offset from it.
> 
> Also, the registers have the scope of the resource/cache. So, for example, if
> CPU 0 and CPU 1 share a L2 cache then it is only necessary to update the
> register on one of these CPUs.

Thanks for your explanation. I understand it now.
A64FX's L2 cache setting registers have a similar resource/cache scope,
so it is only necessary to update the register on one of the CPUs that
share the cache.
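
A minimal sketch of the scoping discussed above, written as a small Python model rather than kernel code. The `Domain` class only loosely mirrors struct rdt_domain, and `build_domains()`, the 4 CLOSIDs, the default value 0xFFFF and the base number in the usage note are hypothetical placeholders, not values taken from the A64FX or x86 specifications. It shows one domain per cache instance, a control value stored per CLOSID, a single CPU chosen to write the shared register, and a per-CLOSID register found as an offset from a base.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Simplified stand-in for struct rdt_domain: one per cache instance."""
    cache_id: int
    cpus: set = field(default_factory=set)        # online CPUs sharing this cache
    ctrl_val: list = field(default_factory=list)  # control value indexed by CLOSID


def domain_add_cpu(domains, cpu, cache_id, num_closid=4, default_ctrl=0xFFFF):
    """Attach an online CPU to the domain of its cache, creating it if needed."""
    dom = domains.get(cache_id)
    if dom is None:
        dom = Domain(cache_id, ctrl_val=[default_ctrl] * num_closid)
        domains[cache_id] = dom
    dom.cpus.add(cpu)
    return dom


def update_domain(dom, closid, new_val):
    """Record the new value and name the one CPU that writes the register.

    The register has the scope of the cache, so a single CPU of the
    domain is enough; the lowest-numbered one is picked here.
    """
    dom.ctrl_val[closid] = new_val
    writer = min(dom.cpus)
    return (writer, closid, new_val)


def register_for_closid(base, closid):
    """Each CLOSID has its own register, located as an offset from the base."""
    return base + closid


def build_domains(num_caches=4, cpus_per_cache=12):
    """Hypothetical topology: 4 shared L2 instances with 12 CPUs each."""
    domains = {}
    for cpu in range(num_caches * cpus_per_cache):
        domain_add_cpu(domains, cpu, cache_id=cpu // cpus_per_cache)
    return domains
```

With this model, `update_domain(build_domains()[1], 2, 0x3)` names CPU 12 as the writer for CLOSID 2 and leaves the other three domains untouched, and `register_for_closid(0x100, 2)` gives 0x102 for a made-up base of 0x100.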

> >>> The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
> >>> any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
> >>> the same time in same NUMA.
> >>>
> >>>
> >>> I think, in your idea, a resource group will be created for each sector ID.
> >>> (> "sectors" could be considered the same as the resctrl "classes of
> >>> service") Then, an example of resource group is created as follows.
> >>> ・ L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3.Y = 0,1,2 ... 11),
> >>> ・ L2: NUMAX-L2sector0 (X = 0,1,2,3)
> >>>
> >>> In this example, the sector with the same ID (0) on all PEs is
> >>> allocated to the resource group. The L1D caches are numbered from
> >>> NUMA0_PE0-L1sector0(0) to NUMA3_PE11-L1sector0(47) and the L2 caches
> >>> are numbered from NUMA0-L2sector0(0) to NUMA3-L2sector0(3).
> >>> (NUMA number X is from 0-3, PE number Y is from 0-11)
> >>> (1) The number of ways of NUMAX-PEY-L1sector0 can be set independently
> >>>       for each PE (0-47). When a task runs in this resource group,
> >>>       we cannot control which PE the task is running on or how many
> >>>       cache ways the task is using.
> >>
> >> resctrl does not control the affinity on which PE/CPU a task is run.
> >> resctrl is an interface with which to configure how resources are
> >> allocated on the system. resctrl could thus provide interface with
> >> which each sector of each cache instance is assigned a number of cache
> >> ways.
> >> resctrl also provides an interface to assign a task with a class of
> >> service (sector id?). Through this the task obtains access to all
> >> resources that is allocated to the particular class of service
> >> (sector id?). Depending on which CPU the task is running it may
> >> indeed experience different performance if the sector id it is
> >> running with does not have the same allocations on all cache instances.
> >> The affinity of the task needs to be managed separately using for example
> >> taskset.
> >> Please see Documentation/x86/resctrl.rst "Examples for RDT allocation
> >> usage"
> >
> > In resctrl_sched_in(), there are comments as follow:
> >    /*
> >    * If this task has a closid/rmid assigned, use it.
> >    * Else use the closid/rmid assigned to this cpu.
> >    */
> > I thought when we write PID to tasks file, this task (PID) will only
> > run on the CPUs which are specified in cpus file in the same resource
> > group. So, the task_struct's closid and cpu's closid is the same.
> > When task's closid is different from cpu's closid?
> 
> resctrl does not manage the affinity of tasks.
> 
> Tony recently summarized the cpus file very well to me: The actual semantics of
> the CPUs file is to associate a CLOSid for a task that is in the default resctrl
> group - while it is running on one of the listed CPUs.
> 
> To answer your question the task's closid could be different from the CPU's
> closid if the task's closid is 0 while it is running on a CPU that is in the cpus file
> of a non-default resource group.
> 
> You can see a summary of the decision flow in section "Resource allocation
> rules" in Documentation/x86/resctrl.rst
> 
> The "cpus" file was created in support of the real-time use cases. In these use
> cases a group of CPUs can be designated as supporting the real-time work and
> with their own resource group and assigned the needed resources to do the
> real-time work. A real-time task can then be started with affinity to those CPUs
> and dynamically any kernel threads (that will be started on the same CPU)
> doing work on behalf of this task would be able to use the resources set aside
> for the real-time work.

Thanks for your explanation. I understand it now.
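
As a rough illustration of the rule quoted from resctrl_sched_in(), the sketch below models the choice in Python. It is not the kernel code: `cpu_to_closid` and the CPU numbers are hypothetical, and only the decision itself (a task's own non-default closid wins, otherwise the closid of the CPU it runs on is used) follows the explanation above.

```python
DEFAULT_CLOSID = 0  # closid of the default resource group

# Hypothetical "cpus" file contents: CPUs 2 and 3 belong to the
# resource group with closid 1, CPUs 0 and 1 stay in the default group.
cpu_to_closid = {0: 0, 1: 0, 2: 1, 3: 1}


def effective_closid(task_closid, cpu_closid):
    """If the task has a closid assigned, use it; else use the CPU's."""
    if task_closid != DEFAULT_CLOSID:
        return task_closid
    return cpu_closid


def closid_at_sched_in(task_closid, cpu):
    """Closid in effect when a task is scheduled in on the given CPU."""
    return effective_closid(task_closid, cpu_to_closid.get(cpu, DEFAULT_CLOSID))
```

A task in the default group (closid 0) that runs on CPU 2 therefore picks up closid 1, which is the case where the task's closid and the CPU's closid differ. A task with closid 2 keeps closid 2 on any CPU.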

I will implement this sector function. If I have further questions,
I would like to mail you again.

Best regards,
Tan Shaopeng



* Re: About add an A64FX cache control function into resctrl
  2021-07-21  8:10                     ` tan.shaopeng
@ 2021-07-21 23:39                       ` Reinette Chatre
  0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2021-07-21 23:39 UTC (permalink / raw)
  To: tan.shaopeng, 'fenghua.yu@intel.com'
  Cc: 'linux-kernel@vger.kernel.org',
	'linux-arm-kernel@lists.infradead.org',
	'James Morse',
	misono.tomohiro, Luck, Tony

Hi Tan Shaopeng,

On 7/21/2021 1:10 AM, tan.shaopeng@fujitsu.com wrote:
> Hi Reinette,
> 
>> On 7/7/2021 4:26 AM, tan.shaopeng@fujitsu.com wrote:
>>>>> Sorry, I have not explained A64FX's sector cache function well yet.
>>>>> I think I need explain this function from different perspective.
>>>>
>>>> You have explained the A64FX's sector cache function well. I have
>>>> also read both specs to understand it better. It appears to me that
>>>> you are not considering the resctrl architecture as part of your
>>>> solution but instead just forcing your architecture onto the resctrl
>>>> filesystem. For example, in resctrl the resource groups are not just
>>>> a directory structure but has significance in what is being
>>>> represented within the directory (a class of service). The files
>>>> within a resource group's directory build on that. From your side I
>>>> have not seen any effort in aligning the sector cache function with the
>>>> resctrl architecture but instead you are just changing resctrl interface to match
>>>> the A64FX architecture.
>>>>
>>>> Could you please take a moment to understand what resctrl is and how
>>>> it could be mapped to A64FX in a coherent way?
>>>
>>> Previously, my idea is based on how to make instructions use different
>>> sectors in one task. After I studied resctrl, to utilize resctrl
>>> architecture on A64FX, I think it’s better to assign one sector to one
>>> task. Thanks for your idea that "sectors" could be considered the same
>>> as the resctrl "classes of service".
>>>
>>> Based on your idea, I am considering the implementation details.
>>> In this email, I will explain the outline of new proposal, and then
>>> please allow me to confirm a few technologies about resctrl.
>>>
>>> The outline of my proposal is as follows.
>>> - Add a sector function equivalent to Intel's CAT function into resctrl.
>>>     (divide shared L2 cache into multiple partitions for multiple cores
>>>     use)
>>> - Allocate one sector to one resource group (one CLOSID). Since one
>>>     core can only be assigned to one resource group, on A64FX each core
>>>     only uses one sector at a time.
>>
>> ok, so a sector is a portion of cache and matches with what can be represented
>> with a resource group.
>>
>> The second part of your comment is not clear to me. In the first part you
>> mention: "one core can only be assigned to one resource group" - this seems to
>> indicate some static assignment between cores and sectors and if this is the
> 
> Sorry, does "static assignment between cores and sectors" mean that
> each core always uses a fixed sector ID? For example, that core 0 always
> uses sector 0 in every case? That is not the case.
> 
>> case this needs more thinking since the current implementation assumes that
>> any core that can access the cache can access all resource groups associated
>> with that cache. On the other hand, you mention "on A64FX each core only uses
>> one sector at a time" - this now sounds dynamic and is how resctrl works since
>> the CPU is assigned a single class of service to indicate all resources
>> accessible to it.
> 
> That is correct. Each core can be assigned to any resource group, and
> each core uses only one sector at a time. Which sector a core uses
> depends on the ID of its resource group (class of service).

Thank you for clarifying. From what I understand this could be supported  
by existing resctrl flows.
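
A small sketch of how the proposal quoted earlier in this thread could read a schemata line, assuming the two points it states: the default sector ID equals the CLOSID, and on A64FX the <cbm> gives only a number of cache ways, not their position. The helper names and the example line are hypothetical. The masks are parsed as hexadecimal, as in the existing resctrl schemata format.

```python
def parse_schemata_line(line):
    """Parse a line such as 'L2:0=f;1=3' into ('L2', {0: 0xf, 1: 0x3})."""
    resource, _, rest = line.partition(":")
    masks = {}
    for entry in rest.split(";"):
        cache_id, _, cbm = entry.partition("=")
        masks[int(cache_id)] = int(cbm.strip(), 16)
    return resource.strip(), masks


def ways_from_cbm(cbm):
    """Only the number of set bits matters here, not where they are."""
    return bin(cbm).count("1")


def sector_settings(closid, line):
    """Per cache instance: which sector is used and how many ways it gets."""
    _, masks = parse_schemata_line(line)
    return {
        cache_id: {"sector": closid, "ways": ways_from_cbm(cbm)}
        for cache_id, cbm in masks.items()
    }
```

Under these assumptions, the masks 0x3 and 0xc would both mean "2 ways", since only the count is used.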

...

>>> In resctrl_sched_in(), there are comments as follow:
>>>     /*
>>>     * If this task has a closid/rmid assigned, use it.
>>>     * Else use the closid/rmid assigned to this cpu.
>>>     */
>>> I thought when we write PID to tasks file, this task (PID) will only
>>> run on the CPUs which are specified in cpus file in the same resource
>>> group. So, the task_struct's closid and cpu's closid is the same.
>>> When task's closid is different from cpu's closid?
>>
>> resctrl does not manage the affinity of tasks.
>>
>> Tony recently summarized the cpus file very well to me: The actual semantics of
>> the CPUs file is to associate a CLOSid for a task that is in the default resctrl
>> group - while it is running on one of the listed CPUs.
>>
>> To answer your question the task's closid could be different from the CPU's
>> closid if the task's closid is 0 while it is running on a CPU that is in the cpus file
>> of a non-default resource group.
>>
>> You can see a summary of the decision flow in section "Resource allocation
>> rules" in Documentation/x86/resctrl.rst
>>
>> The "cpus" file was created in support of the real-time use cases. In these use
>> cases a group of CPUs can be designated as supporting the real-time work and
>> with their own resource group and assigned the needed resources to do the
>> real-time work. A real-time task can then be started with affinity to those CPUs
>> and dynamically any kernel threads (that will be started on the same CPU)
>> doing work on behalf of this task would be able to use the resources set aside
>> for the real-time work.
> 
> Thanks for your explanation. I understand it now.
> 
> I will implement this sector function. If I have further questions,
> I would like to mail you again.

I will help where I can. You may also be interested in the work James is  
busy with. See his latest series at
https://lore.kernel.org/lkml/20210715173043.14222-1-james.morse@arm.com/

Reinette




end of thread, other threads:[~2021-07-21 23:41 UTC | newest]

Thread overview: 20+ messages
2021-04-09  5:46 About add an A64FX cache control function into resctrl tan.shaopeng
2021-04-21  8:37 ` tan.shaopeng
2021-04-21 16:39   ` Reinette Chatre
2021-04-23  8:10     ` tan.shaopeng
2021-04-28  8:16     ` tan.shaopeng
2021-04-29 17:42       ` Reinette Chatre
2021-04-29 17:50         ` Luck, Tony
2021-04-30 11:46           ` Catalin Marinas
2021-05-17  8:29             ` tan.shaopeng
2021-05-17  8:31         ` tan.shaopeng
2021-05-21 17:44           ` Reinette Chatre
2021-05-25  8:45             ` tan.shaopeng
2021-05-26 17:36               ` Reinette Chatre
2021-05-27  8:45                 ` tan.shaopeng
2021-07-07 11:26                 ` tan.shaopeng
2021-07-16  0:49                   ` tan.shaopeng
2021-07-19 23:25                   ` Reinette Chatre
2021-07-21  8:10                     ` tan.shaopeng
2021-07-21 23:39                       ` Reinette Chatre
2021-05-17  8:37     ` tan.shaopeng
