linux-kernel.vger.kernel.org archive mirror
* arm64 + ARM64_64K_PAGES=y
@ 2018-11-06 21:34 Grygorii Strashko
  2018-11-08 12:00 ` Sebastian Andrzej Siewior
  2018-11-09  4:42 ` Anshuman Khandual
  0 siblings, 2 replies; 7+ messages in thread
From: Grygorii Strashko @ 2018-11-06 21:34 UTC (permalink / raw)
  To: linux-rt-users; +Cc: linux-kernel, Linux ARM Mailing List

Hi All,

Has anybody tried to use ARM64 RT with 76K pages enabled?

My attempt shows that enabling CONFIG_ARM64_64K_PAGES=y increases latencies by ~30%.

cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=y

  

T: 0 (  772) P:98 I:1000 C: 120000 Min:      7 Act:   13 Avg:   10 Max:      85
T: 1 (  773) P:98 I:1500 C:  79998 Min:      7 Act:   13 Avg:   10 Max:      71
T: 2 (  774) P:98 I:2000 C:  59997 Min:      7 Act:   11 Avg:   11 Max:      64
T: 3 (  775) P:98 I:2500 C:  47996 Min:      7 Act:   14 Avg:   12 Max:      66

  

cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=n

  

T: 0 (  697) P:98 I:1000 C: 120000 Min:      7 Act:   10 Avg:    9 Max:      38
T: 1 (  698) P:98 I:1500 C:  79987 Min:      7 Act:   10 Avg:   10 Max:      32
T: 2 (  699) P:98 I:2000 C:  59981 Min:      7 Act:   14 Avg:   11 Max:      46
T: 3 (  700) P:98 I:2500 C:  47977 Min:      6 Act:   11 Avg:   10 Max:      45
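
For anyone who wants to compare the two runs numerically, here is a minimal
sketch that parses the cyclictest lines above and summarizes them. The helper
names (parse, summarize) are made up for illustration, and the regular
expression only assumes the line layout shown in this mail, not every
cyclictest output format.

```python
import re

# Result lines copied verbatim from the two runs above.
WITH_64K = """\
T: 0 (  772) P:98 I:1000 C: 120000 Min:      7 Act:   13 Avg:   10 Max:      85
T: 1 (  773) P:98 I:1500 C:  79998 Min:      7 Act:   13 Avg:   10 Max:      71
T: 2 (  774) P:98 I:2000 C:  59997 Min:      7 Act:   11 Avg:   11 Max:      64
T: 3 (  775) P:98 I:2500 C:  47996 Min:      7 Act:   14 Avg:   12 Max:      66
"""

WITHOUT_64K = """\
T: 0 (  697) P:98 I:1000 C: 120000 Min:      7 Act:   10 Avg:    9 Max:      38
T: 1 (  698) P:98 I:1500 C:  79987 Min:      7 Act:   10 Avg:   10 Max:      32
T: 2 (  699) P:98 I:2000 C:  59981 Min:      7 Act:   14 Avg:   11 Max:      46
T: 3 (  700) P:98 I:2500 C:  47977 Min:      6 Act:   11 Avg:   10 Max:      45
"""

# Matches one per-thread summary line in the layout shown above (assumption).
LINE_RE = re.compile(
    r"T:\s*(?P<thread>\d+)\s*\(\s*(?P<tid>\d+)\)\s*P:\s*(?P<prio>\d+)\s*"
    r"I:\s*(?P<interval>\d+)\s*C:\s*(?P<count>\d+)\s*Min:\s*(?P<min>\d+)\s*"
    r"Act:\s*(?P<act>\d+)\s*Avg:\s*(?P<avg>\d+)\s*Max:\s*(?P<max>\d+)"
)


def parse(text):
    """Return one dict of integer fields per cyclictest thread line."""
    return [{k: int(v) for k, v in m.groupdict().items()}
            for m in LINE_RE.finditer(text)]


def summarize(rows):
    """Worst-case Max and mean of the per-thread Avg values (microseconds)."""
    return {
        "worst_max": max(r["max"] for r in rows),
        "mean_avg": sum(r["avg"] for r in rows) / len(rows),
    }


if __name__ == "__main__":
    on = summarize(parse(WITH_64K))
    off = summarize(parse(WITHOUT_64K))
    print("64K pages on :", on)
    print("64K pages off:", off)
    print("worst-case Max ratio: %.2f" % (on["worst_max"] / off["worst_max"]))
```

Which figure to compare (Avg, Max, or something else) is a choice; the sketch
prints both so the difference between them is visible.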



-- 
regards,
-grygorii

* Re: arm64 + ARM64_64K_PAGES=y
  2018-11-06 21:34 arm64 + ARM64_64K_PAGES=y Grygorii Strashko
@ 2018-11-08 12:00 ` Sebastian Andrzej Siewior
  2018-11-08 18:14   ` Grygorii Strashko
  2018-11-09  4:42 ` Anshuman Khandual
  1 sibling, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-11-08 12:00 UTC (permalink / raw)
  To: Grygorii Strashko; +Cc: linux-rt-users, linux-kernel, Linux ARM Mailing List

On 2018-11-06 15:34:55 [-0600], Grygorii Strashko wrote:
> Hi All,
Hi,

> Do anybody tried to use ARM64 RT with 76K pages enabled?

75 would be an off by one but this :)

> My attempt shows that enabling  CONFIG_ARM64_64K_PAGES=y increases latencies by ~30%
> 
> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=y
> 
> 
> T: 0 (  772) P:98 I:1000 C: 120000 Min:      7 Act:   13 Avg:   10 Max:      85
> T: 1 (  773) P:98 I:1500 C:  79998 Min:      7 Act:   13 Avg:   10 Max:      71
> T: 2 (  774) P:98 I:2000 C:  59997 Min:      7 Act:   11 Avg:   11 Max:      64
> T: 3 (  775) P:98 I:2500 C:  47996 Min:      7 Act:   14 Avg:   12 Max:      66
> 
> 
> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=n
> 
> 
> T: 0 (  697) P:98 I:1000 C: 120000 Min:      7 Act:   10 Avg:    9 Max:      38
> T: 1 (  698) P:98 I:1500 C:  79987 Min:      7 Act:   10 Avg:   10 Max:      32
> T: 2 (  699) P:98 I:2000 C:  59981 Min:      7 Act:   14 Avg:   11 Max:      46
> T: 3 (  700) P:98 I:2500 C:  47977 Min:      6 Act:   11 Avg:   10 Max:      45

So this is an idle system?
The Kconfig help says "faster TLB lookup". Interesting.
Are the 16k pages in between (latency wise) by any chance?
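
As a rough illustration of why a larger page size is expected to help the TLB:
the memory one TLB can cover ("TLB reach") is the number of entries times the
page size. The sketch below is only that arithmetic. The 512-entry figure is a
hypothetical value picked for the example, not a property of any particular
core, and the function name is made up.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_size):
    """Bytes of address space covered when every TLB entry maps one page."""
    return entries * page_size


if __name__ == "__main__":
    entries = 512  # hypothetical TLB size, for illustration only
    for page_size in (4 * KIB, 16 * KIB, 64 * KIB):
        reach = tlb_reach(entries, page_size)
        print("%2d KiB pages: %d MiB reach" % (page_size // KIB, reach // MIB))
```

A larger reach means fewer TLB misses only if the workload actually touches
that much memory, which is why the result on an idle system is interesting.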

> -- 
> regards,
> -grygorii

Sebastian

* Re: arm64 + ARM64_64K_PAGES=y
  2018-11-08 12:00 ` Sebastian Andrzej Siewior
@ 2018-11-08 18:14   ` Grygorii Strashko
  2018-11-09 19:15     ` Grygorii Strashko
  0 siblings, 1 reply; 7+ messages in thread
From: Grygorii Strashko @ 2018-11-08 18:14 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, linux-kernel, Linux ARM Mailing List



On 11/8/18 6:00 AM, Sebastian Andrzej Siewior wrote:
> On 2018-11-06 15:34:55 [-0600], Grygorii Strashko wrote:
>> Hi All,
> Hi,
> 
>> Do anybody tried to use ARM64 RT with 76K pages enabled?
> 
> 75 would be an off by one but this :)

Oops 8-). At least the subject is correct.

> 
>> My attempt shows that enabling  CONFIG_ARM64_64K_PAGES=y increases latencies by ~30%
>>
>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=y
>>
>>
>> T: 0 (  772) P:98 I:1000 C: 120000 Min:      7 Act:   13 Avg:   10 Max:      85
>> T: 1 (  773) P:98 I:1500 C:  79998 Min:      7 Act:   13 Avg:   10 Max:      71
>> T: 2 (  774) P:98 I:2000 C:  59997 Min:      7 Act:   11 Avg:   11 Max:      64
>> T: 3 (  775) P:98 I:2500 C:  47996 Min:      7 Act:   14 Avg:   12 Max:      66
>>
>>
>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=n
>>
>>
>> T: 0 (  697) P:98 I:1000 C: 120000 Min:      7 Act:   10 Avg:    9 Max:      38
>> T: 1 (  698) P:98 I:1500 C:  79987 Min:      7 Act:   10 Avg:   10 Max:      32
>> T: 2 (  699) P:98 I:2000 C:  59981 Min:      7 Act:   14 Avg:   11 Max:      46
>> T: 3 (  700) P:98 I:2500 C:  47977 Min:      6 Act:   11 Avg:   10 Max:      45
> 
> So this is an idle system?

Yes (in general) - it's collected with systemd, so some daemons are active.

> The Kconfig help says "faster TLB lookup". Interesting.
> Are the 16k pages in between (latency wise) by any chance?

I'll try it.


-- 
regards,
-grygorii

* Re: arm64 + ARM64_64K_PAGES=y
  2018-11-06 21:34 arm64 + ARM64_64K_PAGES=y Grygorii Strashko
  2018-11-08 12:00 ` Sebastian Andrzej Siewior
@ 2018-11-09  4:42 ` Anshuman Khandual
  1 sibling, 0 replies; 7+ messages in thread
From: Anshuman Khandual @ 2018-11-09  4:42 UTC (permalink / raw)
  To: Grygorii Strashko, linux-rt-users; +Cc: linux-kernel, Linux ARM Mailing List



On 11/07/2018 03:04 AM, Grygorii Strashko wrote:
> Hi All,
> 
> Do anybody tried to use ARM64 RT with 76K pages enabled?
> 
> My attempt shows that enabling  CONFIG_ARM64_64K_PAGES=y increases latencies by ~30%

That depends on what the workload is actually doing. 64K pages should help if
the mapping is a multiple of 64K, persistent, and the access patterns are more
or less linear, which is friendly to the TLB. 64K pages can take a bit more
time if the memory requirement is much smaller than 64K, in which case latency
can probably increase just from zeroing out the single page allocated. That
latency adds up if it happens on a regular basis. perf report can help find
out more about this.
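
To put a number on the small-allocation case, here is a minimal sketch of the
rounding involved. It assumes a request is backed by whole pages and that each
of those pages gets zeroed; the 10 KiB request size and the function name are
made up for the example.

```python
KIB = 1024


def bytes_backing(request_bytes, page_size):
    """Round a request up to whole pages and return the bytes actually backed."""
    pages = -(-request_bytes // page_size)  # ceiling division
    return pages * page_size


if __name__ == "__main__":
    request = 10 * KIB  # hypothetical small allocation
    for page_size in (4 * KIB, 64 * KIB):
        backed = bytes_backing(request, page_size)
        print("%2d KiB pages: %d bytes backed (and zeroed) for a %d byte request"
              % (page_size // KIB, backed, request))
```

With these example numbers the 64K configuration zeroes 65536 bytes where the
4K configuration zeroes 12288. Whether that shows up in latency depends on how
often such allocations happen on the measured path.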

* Re: arm64 + ARM64_64K_PAGES=y
  2018-11-08 18:14   ` Grygorii Strashko
@ 2018-11-09 19:15     ` Grygorii Strashko
  2018-11-12 14:27       ` Andre Przywara
  0 siblings, 1 reply; 7+ messages in thread
From: Grygorii Strashko @ 2018-11-09 19:15 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, linux-kernel, Linux ARM Mailing List



On 11/8/18 12:14 PM, Grygorii Strashko wrote:
> 
> 
> On 11/8/18 6:00 AM, Sebastian Andrzej Siewior wrote:
>> On 2018-11-06 15:34:55 [-0600], Grygorii Strashko wrote:
>>> Hi All,
>> Hi,
>>
>>> Do anybody tried to use ARM64 RT with 76K pages enabled?
>>
>> 75 would be an off by one but this :)
> 
> Ops 8-). at least subj is correct.
> 
>>
>>> My attempt shows that enabling  CONFIG_ARM64_64K_PAGES=y increases latencies by ~30%
>>>
>>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=y
>>>
>>>
>>> T: 0 (  772) P:98 I:1000 C: 120000 Min:      7 Act:   13 Avg:   10 Max:      85
>>> T: 1 (  773) P:98 I:1500 C:  79998 Min:      7 Act:   13 Avg:   10 Max:      71
>>> T: 2 (  774) P:98 I:2000 C:  59997 Min:      7 Act:   11 Avg:   11 Max:      64
>>> T: 3 (  775) P:98 I:2500 C:  47996 Min:      7 Act:   14 Avg:   12 Max:      66
>>>
>>>
>>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=n
>>>
>>>
>>> T: 0 (  697) P:98 I:1000 C: 120000 Min:      7 Act:   10 Avg:    9 Max:      38
>>> T: 1 (  698) P:98 I:1500 C:  79987 Min:      7 Act:   10 Avg:   10 Max:      32
>>> T: 2 (  699) P:98 I:2000 C:  59981 Min:      7 Act:   14 Avg:   11 Max:      46
>>> T: 3 (  700) P:98 I:2500 C:  47977 Min:      6 Act:   11 Avg:   10 Max:      45
>>
>> So this is an idle system?
> 
> Yes (in general) - it's collected with systemd, so some daemons are active.
> 
>> The Kconfig help says "faster TLB lookup". Interesting.
>> Are the 16k pages in between (latency wise) by any chance?
> 
> I'll try it.

No, I won't, at least not quickly. With 16k pages enabled I can't boot the
TI 4.14 kernel (4.14.71-rt44).
There are no messages in the log, just "Starting kernel ..."

-- 
regards,
-grygorii

* Re: arm64 + ARM64_64K_PAGES=y
  2018-11-09 19:15     ` Grygorii Strashko
@ 2018-11-12 14:27       ` Andre Przywara
  2018-11-12 21:22         ` Grygorii Strashko
  0 siblings, 1 reply; 7+ messages in thread
From: Andre Przywara @ 2018-11-12 14:27 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Sebastian Andrzej Siewior, linux-rt-users,
	Linux ARM Mailing List, linux-kernel

On Fri, 9 Nov 2018 13:15:47 -0600
Grygorii Strashko <grygorii.strashko@ti.com> wrote:

Hi,

> On 11/8/18 12:14 PM, Grygorii Strashko wrote:
> > 
> > 
> > On 11/8/18 6:00 AM, Sebastian Andrzej Siewior wrote:  
> >> On 2018-11-06 15:34:55 [-0600], Grygorii Strashko wrote:  
> >>> Hi All,  
> >> Hi,
> >>  
> >>> Do anybody tried to use ARM64 RT with 76K pages enabled?  
> >>
> >> 75 would be an off by one but this :)  
> > 
> > Ops 8-). at least subj is correct.
> >   
> >>  
> >>> My attempt shows that enabling  CONFIG_ARM64_64K_PAGES=y
> >>> increases latencies by ~30%

That's not really surprising. Using a bigger page size granule comes with
some trade-offs (bigger memory overhead, worse cache utilization), so 64K
pages might not be really great for your particular workload. You would
probably need a real performance analysis (using perf, for instance) to
pinpoint TLB misses as your bottleneck.

> >>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=y
> >>>
> >>>
> >>> T: 0 (  772) P:98 I:1000 C: 120000 Min:      7 Act:   13 Avg:   10 Max:      85
> >>> T: 1 (  773) P:98 I:1500 C:  79998 Min:      7 Act:   13 Avg:   10 Max:      71
> >>> T: 2 (  774) P:98 I:2000 C:  59997 Min:      7 Act:   11 Avg:   11 Max:      64
> >>> T: 3 (  775) P:98 I:2500 C:  47996 Min:      7 Act:   14 Avg:   12 Max:      66
> >>>
> >>>
> >>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=n
> >>>
> >>>
> >>> T: 0 (  697) P:98 I:1000 C: 120000 Min:      7 Act:   10 Avg:    9 Max:      38
> >>> T: 1 (  698) P:98 I:1500 C:  79987 Min:      7 Act:   10 Avg:   10 Max:      32
> >>> T: 2 (  699) P:98 I:2000 C:  59981 Min:      7 Act:   14 Avg:   11 Max:      46
> >>> T: 3 (  700) P:98 I:2500 C:  47977 Min:      6 Act:   11 Avg:   10 Max:      45
> >>
> >> So this is an idle system?  
> > 
> > Yes (in general) - it's collected with systemd, so some daemons are
> > active. 
> >> The Kconfig help says "faster TLB lookup". Interesting.
> >> Are the 16k pages in between (latency wise) by any chance?  
> > 
> > I'll try it.  
> 
> no i'll not, at least not fast. with 16k pages enabled I can't boot
> TI 4.14 kernel
> -  4.14.71-rt44.
> No msg in log, just "Starting kernel ..."

You need a core that actually supports 16K pages (support for any given
page size granule is architecturally optional).
Among the Arm Ltd. cores, those are Cortex-A73, A75 and A55, and possibly
other newer ones as well. Cortex-A53, A57 and A72 do not support 16k pages.
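
Granule support is advertised in the TGran4, TGran16 and TGran64 fields of
ID_AA64MMFR0_EL1. Below is a minimal sketch that decodes those fields from a
register value you already have. It does not read the register itself; how
you obtain the value is left open. The function name and the sample values
are made up for illustration, and any encoding other than the "not supported"
one is simply treated as "supported".

```python
def decode_tgran(mmfr0):
    """Decode translation granule support from an ID_AA64MMFR0_EL1 value."""
    tgran16 = (mmfr0 >> 20) & 0xF  # bits [23:20]
    tgran64 = (mmfr0 >> 24) & 0xF  # bits [27:24]
    tgran4 = (mmfr0 >> 28) & 0xF   # bits [31:28]
    return {
        "4K": tgran4 != 0xF,    # 0b1111 means 4K granule not supported
        "16K": tgran16 != 0x0,  # 0b0000 means 16K granule not supported
        "64K": tgran64 != 0xF,  # 0b1111 means 64K granule not supported
    }


if __name__ == "__main__":
    # Illustrative values only: the first has all three TGran fields zero,
    # the second additionally sets TGran16 to 0b0001.
    for value in (0x00001122, 0x00101122):
        print("0x%08x -> %s" % (value, decode_tgran(value)))
```

Note the asymmetry: a zero field means "supported" for 4K and 64K but "not
supported" for 16K, so a value with all three fields zero describes a core
that cannot run a 16K-page kernel.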

Cheers,
Andre.

* Re: arm64 + ARM64_64K_PAGES=y
  2018-11-12 14:27       ` Andre Przywara
@ 2018-11-12 21:22         ` Grygorii Strashko
  0 siblings, 0 replies; 7+ messages in thread
From: Grygorii Strashko @ 2018-11-12 21:22 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Sebastian Andrzej Siewior, linux-rt-users,
	Linux ARM Mailing List, linux-kernel



On 11/12/18 8:27 AM, Andre Przywara wrote:
> On Fri, 9 Nov 2018 13:15:47 -0600
> Grygorii Strashko <grygorii.strashko@ti.com> wrote:
> 
> Hi,
> 
>> On 11/8/18 12:14 PM, Grygorii Strashko wrote:
>>>
>>>
>>> On 11/8/18 6:00 AM, Sebastian Andrzej Siewior wrote:
>>>> On 2018-11-06 15:34:55 [-0600], Grygorii Strashko wrote:
>>>>> Hi All,
>>>> Hi,
>>>>   
>>>>> Do anybody tried to use ARM64 RT with 76K pages enabled?
>>>>
>>>> 75 would be an off by one but this :)
>>>
>>> Ops 8-). at least subj is correct.
>>>    
>>>>   
>>>>> My attempt shows that enabling  CONFIG_ARM64_64K_PAGES=y
>>>>> increases latencies by ~30%
> 
> That's not really surprising. Performance on systems using a bigger page
> size granules might have some trade-offs (bigger memory overhead, worse
> cache utilization), so 64K pages might not be really great for your
> particular workload. You would probably need a real performance
> analysis (using perf, for instance) to pinpoint TLB misses as your
> bottleneck.
> 
>>>>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=y
>>>>>
>>>>>
>>>>> T: 0 (  772) P:98 I:1000 C: 120000 Min:      7 Act:   13 Avg:   10 Max:      85
>>>>> T: 1 (  773) P:98 I:1500 C:  79998 Min:      7 Act:   13 Avg:   10 Max:      71
>>>>> T: 2 (  774) P:98 I:2000 C:  59997 Min:      7 Act:   11 Avg:   11 Max:      64
>>>>> T: 3 (  775) P:98 I:2500 C:  47996 Min:      7 Act:   14 Avg:   12 Max:      66
>>>>>
>>>>>
>>>>> cyclictest -n -m -Sp98 -q -D2m with CONFIG_ARM64_64K_PAGES=n
>>>>>
>>>>>
>>>>> T: 0 (  697) P:98 I:1000 C: 120000 Min:      7 Act:   10 Avg:    9 Max:      38
>>>>> T: 1 (  698) P:98 I:1500 C:  79987 Min:      7 Act:   10 Avg:   10 Max:      32
>>>>> T: 2 (  699) P:98 I:2000 C:  59981 Min:      7 Act:   14 Avg:   11 Max:      46
>>>>> T: 3 (  700) P:98 I:2500 C:  47977 Min:      6 Act:   11 Avg:   10 Max:      45
>>>>
>>>> So this is an idle system?
>>>
>>> Yes (in general) - it's collected with systemd, so some daemons are
>>> active.
>>>> The Kconfig help says "faster TLB lookup". Interesting.
>>>> Are the 16k pages in between (latency wise) by any chance?
>>>
>>> I'll try it.
>>
>> no i'll not, at least not fast. with 16k pages enabled I can't boot
>> TI 4.14 kernel
>> -  4.14.71-rt44.
>> No msg in log, just "Starting kernel ..."
> 
> You need a core that actually supports 16K pages (supporting
> certain page size granules is architecturally optional).
>  From the Arm Ltd. cores it's Cortex-A73, A75 or A55, possibly other
> newer ones as well. Cortex-A53, A57 and A72 do not support 16k pages.

Thanks a lot for your reply.

-- 
regards,
-grygorii
