* arm64 board boot pauses on linux-next (resend)
@ 2018-12-12 16:48 John Garry
  2018-12-12 17:17 ` Robin Murphy
  0 siblings, 1 reply; 11+ messages in thread
From: John Garry @ 2018-12-12 16:48 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Linuxarm

Hi all,

I am finding our arm64 D05 board particularly slow to boot from 
linux-next, specifically a 30+ second pause when setting up CPU features.

I tried to bisect, but I am finding bisect holes (board does not boot at 
all).

Snippet from a good boot (e.g. v4.20-rc6):
[    5.482756] smp: Brought up 4 nodes, 64 CPUs
[    7.423242] SMP: Total of 64 processors activated.
[    7.428087] CPU features: detected: GIC system register CPU interface
[    7.434620] CPU features: detected: 32-bit EL0 Support
[    7.439813] CPU features: detected: CRC32 instructions
[    7.530336] CPU: All CPU(s) started at EL2
[    7.534599] alternatives: patching kernel code
[    7.556042] devtmpfs: initialized
[    7.559982] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    7.569991] futex hash table entries: 16384 (order: 8, 1048576 bytes)
[    7.577262] pinctrl core: initialized pinctrl subsystem
[    7.582947] SMBIOS 3.0.0 present.
[    7.586293] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17 Nemo 2.0 RC0 10/05/2018
[    7.594812] NET: Registered protocol family 16
[    7.600073] audit: initializing netlink subsys (disabled)
[    7.605593] audit: type=2000 audit(5.072:1): state=initialized audit_enabled=0 res=1


Snippet next-20181211:
[    5.483420] GICv3: CPU63: found redistributor 70303 region 63:0x000004006d4c0000
[    5.483438] GICv3: CPU63: using allocated LPI pending table @0x0000001fb8dd0000
[    5.483578] arch_timer: CPU63: Trapping CNTVCT access
[    5.483582] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
[    5.483666] smp: Brought up 4 nodes, 64 CPUs
[    7.424162] SMP: Total of 64 processors activated.
[    7.429006] CPU features: detected: GIC system register CPU interface
[    7.435538] CPU features: detected: 32-bit EL0 Support
[    7.440731] CPU features: detected: CRC32 instructions
[   40.107984] CPU: All CPU(s) started at EL2
[   40.112264] alternatives: patching kernel code
[   40.134299] devtmpfs: initialized
[   40.138196] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[   40.148209] futex hash table entries: 16384 (order: 8, 1048576 bytes)
[   40.155486] pinctrl core: initialized pinctrl subsystem
[   40.161109] SMBIOS 3.0.0 present.
[   40.164456] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17 Nemo 2.0 RC0 10/05/2018
[   40.172985] NET: Registered protocol family 16
[   40.178236] audit: initializing netlink subsys (disabled)
[   40.183748] audit: type=2000 audit(5.072:1): state=initialized audit_enabled=0 res=1
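
For anyone who wants to locate the pause mechanically rather than by eye, here
is a minimal sketch that scans dmesg-style lines for the widest gap between
consecutive timestamps. The sample lines are copied from the next-20181211
snippet above; the name largest_gap() and the regular expression are purely
illustrative and not part of any existing tool.

```python
import re

# Lines copied from the next-20181211 snippet above.
LOG = """\
[    5.483666] smp: Brought up 4 nodes, 64 CPUs
[    7.424162] SMP: Total of 64 processors activated.
[    7.429006] CPU features: detected: GIC system register CPU interface
[    7.435538] CPU features: detected: 32-bit EL0 Support
[    7.440731] CPU features: detected: CRC32 instructions
[   40.107984] CPU: All CPU(s) started at EL2
[   40.112264] alternatives: patching kernel code
"""

# "[   40.107984] message" -> (timestamp, message)
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s(.*)$")


def largest_gap(log_text):
    """Return (gap_seconds, message_before, message_after) for the widest
    gap between consecutive timestamped lines, or None if there are fewer
    than two such lines."""
    entries = []
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if match:  # wrapped continuation lines carry no timestamp; skip them
            entries.append((float(match.group(1)), match.group(2)))

    best = None
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best


gap, before, after = largest_gap(LOG)
print(f"{gap:.3f}s between {before!r} and {after!r}")
```

On these lines it reports a gap of roughly 32.7 s between the CRC32 message
and "CPU: All CPU(s) started at EL2"; in the v4.20-rc6 snippet the same two
messages are about 0.09 s apart.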

Anyone else notice a similar issue?

Our nextgen D06 board does not seem to have this issue.

Thanks,
John


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-12 16:48 arm64 board boot pauses on linux-next (resend) John Garry
@ 2018-12-12 17:17 ` Robin Murphy
  2018-12-12 18:07   ` John Garry
  0 siblings, 1 reply; 11+ messages in thread
From: Robin Murphy @ 2018-12-12 17:17 UTC (permalink / raw)
  To: John Garry, linux-arm-kernel; +Cc: Linuxarm, Suzuki K Poulose

[ +Suzuki ]

Hi John,

On 12/12/2018 16:48, John Garry wrote:
> Hi all,
> 
> I am finding our arm64 D05 board particularly slow to boot from 
> linux-next, specifically a 30+ second pause when setting up CPU features.

That does look like it's almost certainly spending the mystery time in 
setup_cpu_features() itself.

> I tried to bisect, but I am finding bisect holes (board does not boot at 
> all).

I'd suggest focusing on the arm64 for-next/core branch, since we have 
some cpufeature rework queued there which was supposed to make it go 
quicker...

Robin.

> 
> Snippet good, like v4.20-rc6:
> [    5.482756] smp: Brought up 4 nodes, 64 CPUs
> [    7.423242] SMP: Total of 64 processors activated.
> [    7.428087] CPU features: detected: GIC system register CPU interface
> [    7.434620] CPU features: detected: 32-bit EL0 Support
> [    7.439813] CPU features: detected: CRC32 instructions
> [    7.530336] CPU: All CPU(s) started at EL2
> [    7.534599] alternatives: patching kernel code
> [    7.556042] devtmpfs: initialized
> [    7.559982] clocksource: jiffies: mask: 0xffffffff max_cycles: 
> 0xffffffff, max_idle_ns: 7645041785100000 ns
> [    7.569991] futex hash table entries: 16384 (order: 8, 1048576 bytes)
> [    7.577262] pinctrl core: initialized pinctrl subsystem
> [    7.582947] SMBIOS 3.0.0 present.
> [    7.586293] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17 
> Nemo 2.0 RC0 10/05/2018
> [    7.594812] NET: Registered protocol family 16
> [    7.600073] audit: initializing netlink subsys (disabled)
> [    7.605593] audit: type=2000 audit(5.072:1): state=initialized 
> audit_enabled=0 res=1
> 
> 
> Snippet next-20181211:
> [    5.483420] GICv3: CPU63: found redistributor 70303 region 
> 63:0x000004006d4c0000
> [    5.483438] GICv3: CPU63: using allocated LPI pending table 
> @0x0000001fb8dd0000
> [    5.483578] arch_timer: CPU63: Trapping CNTVCT access
> [    5.483582] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
> [    5.483666] smp: Brought up 4 nodes, 64 CPUs
> [    7.424162] SMP: Total of 64 processors activated.
> [    7.429006] CPU features: detected: GIC system register CPU interface
> [    7.435538] CPU features: detected: 32-bit EL0 Support
> [    7.440731] CPU features: detected: CRC32 instructions
> [   40.107984] CPU: All CPU(s) started at EL2
> [   40.112264] alternatives: patching kernel code
> [   40.134299] devtmpfs: initialized
> [   40.138196] clocksource: jiffies: mask: 0xffffffff max_cycles: 
> 0xffffffff, max_idle_ns: 7645041785100000 ns
> [   40.148209] futex hash table entries: 16384 (order: 8, 1048576 bytes)
> [   40.155486] pinctrl core: initialized pinctrl subsystem
> [   40.161109] SMBIOS 3.0.0 present.
> [   40.164456] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17 
> Nemo 2.0 RC0 10/05/2018
> [   40.172985] NET: Registered protocol family 16
> [   40.178236] audit: initializing netlink subsys (disabled)
> [   40.183748] audit: type=2000 audit(5.072:1): state=initialized 
> audit_enabled=0 res=1
> 
> Anyone else notice a similar issue?
> 
> Our nextgen D06 board does not seem to have this issue.
> 
> Thanks,
> John
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-12 17:17 ` Robin Murphy
@ 2018-12-12 18:07   ` John Garry
  2018-12-12 19:16     ` Suzuki K Poulose
  0 siblings, 1 reply; 11+ messages in thread
From: John Garry @ 2018-12-12 18:07 UTC (permalink / raw)
  To: Robin Murphy, linux-arm-kernel; +Cc: Linuxarm, Suzuki K Poulose

On 12/12/2018 17:17, Robin Murphy wrote:
> [ +Suzuki ]
>
> Hi John,
>

Hi Robin,

> On 12/12/2018 16:48, John Garry wrote:
>> Hi all,
>>
>> I am finding our arm64 D05 board particularly slow to boot from
>> linux-next, specifically a 30+ second pause when setting up CPU features.
>
> That does look like it's almost certainly spending the mystery time in
> setup_cpu_features() itself.

Seems to be "Kernel page table isolation (KPTI)" feature which we hang 
on, but I would not say that's conclusive.

>
>> I tried to bisect, but I am finding bisect holes (board does not boot
>> at all).
>
> I'd suggest focusing on the arm64 for-next/core branch, since we have
> some cpufeature rework queued there which was supposed to make it go
> quicker...

Tip of this branch doesn't exhibit the issue.

Cheers,
John

>
> Robin.
>
>>
>> Snippet good, like v4.20-rc6:
>> [    5.482756] smp: Brought up 4 nodes, 64 CPUs
>> [    7.423242] SMP: Total of 64 processors activated.
>> [    7.428087] CPU features: detected: GIC system register CPU interface
>> [    7.434620] CPU features: detected: 32-bit EL0 Support
>> [    7.439813] CPU features: detected: CRC32 instructions
>> [    7.530336] CPU: All CPU(s) started at EL2
>> [    7.534599] alternatives: patching kernel code
>> [    7.556042] devtmpfs: initialized
>> [    7.559982] clocksource: jiffies: mask: 0xffffffff max_cycles:
>> 0xffffffff, max_idle_ns: 7645041785100000 ns
>> [    7.569991] futex hash table entries: 16384 (order: 8, 1048576 bytes)
>> [    7.577262] pinctrl core: initialized pinctrl subsystem
>> [    7.582947] SMBIOS 3.0.0 present.
>> [    7.586293] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17
>> Nemo 2.0 RC0 10/05/2018
>> [    7.594812] NET: Registered protocol family 16
>> [    7.600073] audit: initializing netlink subsys (disabled)
>> [    7.605593] audit: type=2000 audit(5.072:1): state=initialized
>> audit_enabled=0 res=1
>>
>>
>> Snippet next-20181211:
>> [    5.483420] GICv3: CPU63: found redistributor 70303 region
>> 63:0x000004006d4c0000
>> [    5.483438] GICv3: CPU63: using allocated LPI pending table
>> @0x0000001fb8dd0000
>> [    5.483578] arch_timer: CPU63: Trapping CNTVCT access
>> [    5.483582] CPU63: Booted secondary processor 0x0000070303
>> [0x410fd082]
>> [    5.483666] smp: Brought up 4 nodes, 64 CPUs
>> [    7.424162] SMP: Total of 64 processors activated.
>> [    7.429006] CPU features: detected: GIC system register CPU interface
>> [    7.435538] CPU features: detected: 32-bit EL0 Support
>> [    7.440731] CPU features: detected: CRC32 instructions
>> [   40.107984] CPU: All CPU(s) started at EL2
>> [   40.112264] alternatives: patching kernel code
>> [   40.134299] devtmpfs: initialized
>> [   40.138196] clocksource: jiffies: mask: 0xffffffff max_cycles:
>> 0xffffffff, max_idle_ns: 7645041785100000 ns
>> [   40.148209] futex hash table entries: 16384 (order: 8, 1048576 bytes)
>> [   40.155486] pinctrl core: initialized pinctrl subsystem
>> [   40.161109] SMBIOS 3.0.0 present.
>> [   40.164456] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17
>> Nemo 2.0 RC0 10/05/2018
>> [   40.172985] NET: Registered protocol family 16
>> [   40.178236] audit: initializing netlink subsys (disabled)
>> [   40.183748] audit: type=2000 audit(5.072:1): state=initialized
>> audit_enabled=0 res=1
>>
>> Anyone else notice a similar issue?
>>
>> Our nextgen D06 board does not seem to have this issue.
>>
>> Thanks,
>> John
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> .
>




* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-12 18:07   ` John Garry
@ 2018-12-12 19:16     ` Suzuki K Poulose
  2018-12-13 11:27       ` John Garry
  0 siblings, 1 reply; 11+ messages in thread
From: Suzuki K Poulose @ 2018-12-12 19:16 UTC (permalink / raw)
  To: John Garry, Robin Murphy, linux-arm-kernel; +Cc: Linuxarm

Hi

On 12/12/2018 06:07 PM, John Garry wrote:
> On 12/12/2018 17:17, Robin Murphy wrote:
>> [ +Suzuki ]
>>
>> Hi John,
>>
> 
> Hi Robin,
> 
>> On 12/12/2018 16:48, John Garry wrote:
>>> Hi all,
>>>
>>> I am finding our arm64 D05 board particularly slow to boot from
>>> linux-next, specifically a 30+ second pause when setting up CPU 
>>> features.
>>
>> That does look like it's almost certainly spending the mystery time in
>> setup_cpu_features() itself.
> 
> Seems to be "Kernel page table isolation (KPTI)" feature which we hang 
> on, but I would not say that's conclusive.
> 

I think I have an idea what could be happening. The cpu_enable()
for KPTI, waits for all the secondary CPUs to enter a busy loop,
before installing the non-global mapping. So, with the changes in
-next, we batch the cpu_enable() callbacks, which implies the secondary
CPUs end up in the "cpu_enable()" for KPTI at different times.

Could you check if the following hack makes it any better ?

---8>---

hack: Reorder KPTI to the top


Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
  arch/arm64/include/asm/cpucaps.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index a89f587..1363e09 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -18,7 +18,7 @@
 #ifndef __ASM_CPUCAPS_H
 #define __ASM_CPUCAPS_H

-#define ARM64_WORKAROUND_CLEAN_CACHE		0
+#define ARM64_UNMAP_KERNEL_AT_EL0		0
 #define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE	1
 #define ARM64_WORKAROUND_845719			2
 #define ARM64_HAS_SYSREG_GIC_CPUIF		3
@@ -41,7 +41,7 @@
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_DCPOP				21
 #define ARM64_SVE				22
-#define ARM64_UNMAP_KERNEL_AT_EL0		23
+#define ARM64_WORKAROUND_CLEAN_CACHE		23
 #define ARM64_HARDEN_BRANCH_PREDICTOR		24
 #define ARM64_HAS_RAS_EXTN			25
 #define ARM64_WORKAROUND_843419			26
-- 
2.7.4
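
To make the intent of the hack concrete, here is a small sketch of its effect
on ordering. It assumes (this is only implied by the patch title, not stated
anywhere in the thread) that the batched cpu_enable() callbacks run in
ascending capability number. Only the entries visible in the diff are
modelled, and the helper names swap() and callback_order() are hypothetical.

```python
# Capability numbers as they appear in the patch above, before it is applied.
# Only the entries visible in the diff are listed; the real header has more.
BEFORE = {
    "ARM64_WORKAROUND_CLEAN_CACHE": 0,
    "ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE": 1,
    "ARM64_WORKAROUND_845719": 2,
    "ARM64_HAS_SYSREG_GIC_CPUIF": 3,
    "ARM64_WORKAROUND_CAVIUM_30115": 20,
    "ARM64_HAS_DCPOP": 21,
    "ARM64_SVE": 22,
    "ARM64_UNMAP_KERNEL_AT_EL0": 23,
    "ARM64_HARDEN_BRANCH_PREDICTOR": 24,
    "ARM64_HAS_RAS_EXTN": 25,
    "ARM64_WORKAROUND_843419": 26,
}

KPTI = "ARM64_UNMAP_KERNEL_AT_EL0"


def swap(caps, a, b):
    """Return a copy of caps with the numbers of a and b exchanged."""
    out = dict(caps)
    out[a], out[b] = caps[b], caps[a]
    return out


def callback_order(caps):
    """ASSUMPTION: batched callbacks run in ascending capability number."""
    return sorted(caps, key=caps.get)


# What the hack does: exchange the numbers of CLEAN_CACHE and KPTI.
AFTER = swap(BEFORE, "ARM64_WORKAROUND_CLEAN_CACHE", KPTI)

print("KPTI position before:", callback_order(BEFORE).index(KPTI))
print("KPTI position after: ", callback_order(AFTER).index(KPTI))
```

Under that assumption the KPTI callback moves to the front of the batch, so
every CPU would reach it before any other callback.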


>>
>>> I tried to bisect, but I am finding bisect holes (board does not boot
>>> at all).
>>
>> I'd suggest focusing on the arm64 for-next/core branch, since we have
>> some cpufeature rework queued there which was supposed to make it go
>> quicker...
> 
> Tip of this branch doesn't exhibit the issue.
> 
> Cheers,
> John
> 
>>
>> Robin.
>>
>>>
>>> Snippet good, like v4.20-rc6:
>>> [    5.482756] smp: Brought up 4 nodes, 64 CPUs
>>> [    7.423242] SMP: Total of 64 processors activated.
>>> [    7.428087] CPU features: detected: GIC system register CPU interface
>>> [    7.434620] CPU features: detected: 32-bit EL0 Support
>>> [    7.439813] CPU features: detected: CRC32 instructions
>>> [    7.530336] CPU: All CPU(s) started at EL2
>>> [    7.534599] alternatives: patching kernel code
>>> [    7.556042] devtmpfs: initialized
>>> [    7.559982] clocksource: jiffies: mask: 0xffffffff max_cycles:
>>> 0xffffffff, max_idle_ns: 7645041785100000 ns
>>> [    7.569991] futex hash table entries: 16384 (order: 8, 1048576 bytes)
>>> [    7.577262] pinctrl core: initialized pinctrl subsystem
>>> [    7.582947] SMBIOS 3.0.0 present.
>>> [    7.586293] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17
>>> Nemo 2.0 RC0 10/05/2018
>>> [    7.594812] NET: Registered protocol family 16
>>> [    7.600073] audit: initializing netlink subsys (disabled)
>>> [    7.605593] audit: type=2000 audit(5.072:1): state=initialized
>>> audit_enabled=0 res=1
>>>
>>>
>>> Snippet next-20181211:
>>> [    5.483420] GICv3: CPU63: found redistributor 70303 region
>>> 63:0x000004006d4c0000
>>> [    5.483438] GICv3: CPU63: using allocated LPI pending table
>>> @0x0000001fb8dd0000
>>> [    5.483578] arch_timer: CPU63: Trapping CNTVCT access
>>> [    5.483582] CPU63: Booted secondary processor 0x0000070303
>>> [0x410fd082]
>>> [    5.483666] smp: Brought up 4 nodes, 64 CPUs
>>> [    7.424162] SMP: Total of 64 processors activated.
>>> [    7.429006] CPU features: detected: GIC system register CPU interface
>>> [    7.435538] CPU features: detected: 32-bit EL0 Support
>>> [    7.440731] CPU features: detected: CRC32 instructions
>>> [   40.107984] CPU: All CPU(s) started at EL2
>>> [   40.112264] alternatives: patching kernel code
>>> [   40.134299] devtmpfs: initialized
>>> [   40.138196] clocksource: jiffies: mask: 0xffffffff max_cycles:
>>> 0xffffffff, max_idle_ns: 7645041785100000 ns
>>> [   40.148209] futex hash table entries: 16384 (order: 8, 1048576 bytes)
>>> [   40.155486] pinctrl core: initialized pinctrl subsystem
>>> [   40.161109] SMBIOS 3.0.0 present.
>>> [   40.164456] DMI: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT17
>>> Nemo 2.0 RC0 10/05/2018
>>> [   40.172985] NET: Registered protocol family 16
>>> [   40.178236] audit: initializing netlink subsys (disabled)
>>> [   40.183748] audit: type=2000 audit(5.072:1): state=initialized
>>> audit_enabled=0 res=1
>>>
>>> Anyone else notice a similar issue?
>>>
>>> Our nextgen D06 board does not seem to have this issue.
>>>
>>> Thanks,
>>> John
>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>> .
>>
> 
> 



* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-12 19:16     ` Suzuki K Poulose
@ 2018-12-13 11:27       ` John Garry
  2018-12-13 11:33         ` Suzuki K Poulose
  2018-12-13 13:02         ` Will Deacon
  0 siblings, 2 replies; 11+ messages in thread
From: John Garry @ 2018-12-13 11:27 UTC (permalink / raw)
  To: Suzuki K Poulose, Robin Murphy, linux-arm-kernel; +Cc: Linuxarm

On 12/12/2018 19:16, Suzuki K Poulose wrote:
> Hi
>
> On 12/12/2018 06:07 PM, John Garry wrote:
>> On 12/12/2018 17:17, Robin Murphy wrote:
>>> [ +Suzuki ]
>>>
>>> Hi John,
>>>
>>
>> Hi Robin,
>>
>>> On 12/12/2018 16:48, John Garry wrote:
>>>> Hi all,
>>>>
>>>> I am finding our arm64 D05 board particularly slow to boot from
>>>> linux-next, specifically a 30+ second pause when setting up CPU
>>>> features.
>>>
>>> That does look like it's almost certainly spending the mystery time in
>>> setup_cpu_features() itself.
>>
>> Seems to be "Kernel page table isolation (KPTI)" feature which we hang
>> on, but I would not say that's conclusive.
>>
>
> I think I have an idea what could be happening. The cpu_enable()
> for KPTI, waits for all the secondary CPUs to enter a busy loop,
> before installing the non-global mapping. So, with the changes in
> -next, we batch the cpu_enable() callbacks, which implies the secondary
> CPUs end up in the "cpu_enable()" for KPTI at different times.
>
> Could you check if the following hack makes it any better ?
>

Hi,

Unfortunately it does not help:
[    5.502243] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
[    5.502329] smp: Brought up 4 nodes, 64 CPUs
[    7.442722] SMP: Total of 64 processors activated.
[    7.447567] CPU features: detected: GIC system register CPU interface
[    7.454098] CPU features: detected: 32-bit EL0 Support
[    7.459291] CPU features: detected: CRC32 instructions
[   40.236781] CPU: All CPU(s) started at EL2
[   40.241062] alternatives: patching kernel code
[   40.263213] devtmpfs: initialized
[   40.267140] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[   40.277147] futex hash table entries: 16384 (order: 8, 1048576 bytes)
[   40.284424] pinctrl core: initialized pinctrl subsystem
[   40.290081] SMBIOS 3.0.0 present.

BTW, if you guys know the reason for this delay and it is not going to
be an issue, then that's fine. I just wanted to raise awareness.

Cheers,
John

> ---8>---
>
> hack: Reorder KPTI to the top
>
>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/include/asm/cpucaps.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpucaps.h
> b/arch/arm64/include/asm/cpucaps.h
> index a89f587..1363e09 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -18,7 +18,7 @@
>  #ifndef __ASM_CPUCAPS_H
>  #define __ASM_CPUCAPS_H
>
> -#define ARM64_WORKAROUND_CLEAN_CACHE        0
> +#define ARM64_UNMAP_KERNEL_AT_EL0        0
>  #define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE    1
>  #define ARM64_WORKAROUND_845719            2
>  #define ARM64_HAS_SYSREG_GIC_CPUIF        3
> @@ -41,7 +41,7 @@
>  #define ARM64_WORKAROUND_CAVIUM_30115        20
>  #define ARM64_HAS_DCPOP                21
>  #define ARM64_SVE                22
> -#define ARM64_UNMAP_KERNEL_AT_EL0        23
> +#define ARM64_WORKAROUND_CLEAN_CACHE        23
>  #define ARM64_HARDEN_BRANCH_PREDICTOR        24
>  #define ARM64_HAS_RAS_EXTN            25
>  #define ARM64_WORKAROUND_843419            26




* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-13 11:27       ` John Garry
@ 2018-12-13 11:33         ` Suzuki K Poulose
  2018-12-13 12:59           ` John Garry
  2018-12-13 13:02         ` Will Deacon
  1 sibling, 1 reply; 11+ messages in thread
From: Suzuki K Poulose @ 2018-12-13 11:33 UTC (permalink / raw)
  To: John Garry, Robin Murphy, linux-arm-kernel; +Cc: Linuxarm

Hi John,

On 13/12/2018 11:27, John Garry wrote:
> On 12/12/2018 19:16, Suzuki K Poulose wrote:
>> Hi
>>
>> On 12/12/2018 06:07 PM, John Garry wrote:
>>> On 12/12/2018 17:17, Robin Murphy wrote:
>>>> [ +Suzuki ]
>>>>
>>>> Hi John,
>>>>
>>>
>>> Hi Robin,
>>>
>>>> On 12/12/2018 16:48, John Garry wrote:
>>>>> Hi all,
>>>>>
>>>>> I am finding our arm64 D05 board particularly slow to boot from
>>>>> linux-next, specifically a 30+ second pause when setting up CPU
>>>>> features.
>>>>
>>>> That does look like it's almost certainly spending the mystery time in
>>>> setup_cpu_features() itself.
>>>
>>> Seems to be "Kernel page table isolation (KPTI)" feature which we hang
>>> on, but I would not say that's conclusive.
>>>
>>
>> I think I have an idea what could be happening. The cpu_enable()
>> for KPTI, waits for all the secondary CPUs to enter a busy loop,
>> before installing the non-global mapping. So, with the changes in
>> -next, we batch the cpu_enable() callbacks, which implies the secondary
>> CPUs end up in the "cpu_enable()" for KPTI at different times.
>>
>> Could you check if the following hack makes it any better ?
>>
> 
> Hi,
> 
> Unfortunately it does not help:
> [    5.502243] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
> [    5.502329] smp: Brought up 4 nodes, 64 CPUs
> [    7.442722] SMP: Total of 64 processors activated.
> [    7.447567] CPU features: detected: GIC system register CPU interface
> [    7.454098] CPU features: detected: 32-bit EL0 Support
> [    7.459291] CPU features: detected: CRC32 instructions
> [   40.236781] CPU: All CPU(s) started at EL2

That's strange.


> [   40.241062] alternatives: patching kernel code
> [   40.263213] devtmpfs: initialized
> [   40.267140] clocksource: jiffies: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 7645041785100000 ns
> [   40.277147] futex hash table entries: 16384 (order: 8, 1048576 bytes)
> [   40.284424] pinctrl core: initialized pinctrl subsystem
> [   40.290081] SMBIOS 3.0.0 present.
> 
> BTW, If you guys know the reason for this delay and it is not going to
> be an issue, then that's fine. I just wanted to raise awareness.
> 

It would be good to get to the bottom of it, given that it delays boot by 30+
seconds. Does reverting the following patch make any difference ?

"arm64: capabilities: Batch cpu_enable callbacks"

I will try to reproduce it locally, on a different platform though.

Cheers
Suzuki

> Cheers,
> John
> 
>> ---8>---
>>
>> hack: Reorder KPTI to the top
>>
>>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   arch/arm64/include/asm/cpucaps.h | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/cpucaps.h
>> b/arch/arm64/include/asm/cpucaps.h
>> index a89f587..1363e09 100644
>> --- a/arch/arm64/include/asm/cpucaps.h
>> +++ b/arch/arm64/include/asm/cpucaps.h
>> @@ -18,7 +18,7 @@
>>   #ifndef __ASM_CPUCAPS_H
>>   #define __ASM_CPUCAPS_H
>>
>> -#define ARM64_WORKAROUND_CLEAN_CACHE        0
>> +#define ARM64_UNMAP_KERNEL_AT_EL0        0
>>   #define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE    1
>>   #define ARM64_WORKAROUND_845719            2
>>   #define ARM64_HAS_SYSREG_GIC_CPUIF        3
>> @@ -41,7 +41,7 @@
>>   #define ARM64_WORKAROUND_CAVIUM_30115        20
>>   #define ARM64_HAS_DCPOP                21
>>   #define ARM64_SVE                22
>> -#define ARM64_UNMAP_KERNEL_AT_EL0        23
>> +#define ARM64_WORKAROUND_CLEAN_CACHE        23
>>   #define ARM64_HARDEN_BRANCH_PREDICTOR        24
>>   #define ARM64_HAS_RAS_EXTN            25
>>   #define ARM64_WORKAROUND_843419            26
> 
> 


* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-13 11:33         ` Suzuki K Poulose
@ 2018-12-13 12:59           ` John Garry
  0 siblings, 0 replies; 11+ messages in thread
From: John Garry @ 2018-12-13 12:59 UTC (permalink / raw)
  To: Suzuki K Poulose, Robin Murphy, linux-arm-kernel; +Cc: Linuxarm

On 13/12/2018 11:33, Suzuki K Poulose wrote:
> Hi John,
>
> On 13/12/2018 11:27, John Garry wrote:
>> On 12/12/2018 19:16, Suzuki K Poulose wrote:
>>> Hi
>>>
>>> On 12/12/2018 06:07 PM, John Garry wrote:
>>>> On 12/12/2018 17:17, Robin Murphy wrote:
>>>>> [ +Suzuki ]
>>>>>
>>>>> Hi John,
>>>>>
>>>>
>>>> Hi Robin,
>>>>
>>>>> On 12/12/2018 16:48, John Garry wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I am finding our arm64 D05 board particularly slow to boot from
>>>>>> linux-next, specifically a 30+ second pause when setting up CPU
>>>>>> features.
>>>>>
>>>>> That does look like it's almost certainly spending the mystery time in
>>>>> setup_cpu_features() itself.
>>>>
>>>> Seems to be "Kernel page table isolation (KPTI)" feature which we hang
>>>> on, but I would not say that's conclusive.
>>>>
>>>
>>> I think I have an idea what could be happening. The cpu_enable()
>>> for KPTI, waits for all the secondary CPUs to enter a busy loop,
>>> before installing the non-global mapping. So, with the changes in
>>> -next, we batch the cpu_enable() callbacks, which implies the secondary
>>> CPUs end up in the "cpu_enable()" for KPTI at different times.
>>>
>>> Could you check if the following hack makes it any better ?
>>>
>>
>> Hi,
>>
>> Unfortunately it does not help:
>> [    5.502243] CPU63: Booted secondary processor 0x0000070303
>> [0x410fd082]
>> [    5.502329] smp: Brought up 4 nodes, 64 CPUs
>> [    7.442722] SMP: Total of 64 processors activated.
>> [    7.447567] CPU features: detected: GIC system register CPU interface
>> [    7.454098] CPU features: detected: 32-bit EL0 Support
>> [    7.459291] CPU features: detected: CRC32 instructions
>> [   40.236781] CPU: All CPU(s) started at EL2
>
> That's strange.
>
>
>> [   40.241062] alternatives: patching kernel code
>> [   40.263213] devtmpfs: initialized
>> [   40.267140] clocksource: jiffies: mask: 0xffffffff max_cycles:
>> 0xffffffff, max_idle_ns: 7645041785100000 ns
>> [   40.277147] futex hash table entries: 16384 (order: 8, 1048576 bytes)
>> [   40.284424] pinctrl core: initialized pinctrl subsystem
>> [   40.290081] SMBIOS 3.0.0 present.
>>
>> BTW, If you guys know the reason for this delay and it is not going to
>> be an issue, then that's fine. I just wanted to raise awareness.
>>
>
> It would be good to get to the bottom of it, given that it delays boot by 30+
> seconds. Does reverting the following patch make any difference ?
>
> "arm64: capabilities: Batch cpu_enable callbacks"

That doesn't help either; the result is the same as before.

John

>
> I will try to reproduce it locally, on a different platform though.
>
> Cheers
> Suzuki
>
>> Cheers,
>> John
>>
>>> ---8>---
>>>
>>> hack: Reorder KPTI to the top
>>>
>>>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> ---
>>>   arch/arm64/include/asm/cpucaps.h | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/cpucaps.h
>>> b/arch/arm64/include/asm/cpucaps.h
>>> index a89f587..1363e09 100644
>>> --- a/arch/arm64/include/asm/cpucaps.h
>>> +++ b/arch/arm64/include/asm/cpucaps.h
>>> @@ -18,7 +18,7 @@
>>>   #ifndef __ASM_CPUCAPS_H
>>>   #define __ASM_CPUCAPS_H
>>>
>>> -#define ARM64_WORKAROUND_CLEAN_CACHE        0
>>> +#define ARM64_UNMAP_KERNEL_AT_EL0        0
>>>   #define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE    1
>>>   #define ARM64_WORKAROUND_845719            2
>>>   #define ARM64_HAS_SYSREG_GIC_CPUIF        3
>>> @@ -41,7 +41,7 @@
>>>   #define ARM64_WORKAROUND_CAVIUM_30115        20
>>>   #define ARM64_HAS_DCPOP                21
>>>   #define ARM64_SVE                22
>>> -#define ARM64_UNMAP_KERNEL_AT_EL0        23
>>> +#define ARM64_WORKAROUND_CLEAN_CACHE        23
>>>   #define ARM64_HARDEN_BRANCH_PREDICTOR        24
>>>   #define ARM64_HAS_RAS_EXTN            25
>>>   #define ARM64_WORKAROUND_843419            26
>>
>>
>
> .
>




* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-13 11:27       ` John Garry
  2018-12-13 11:33         ` Suzuki K Poulose
@ 2018-12-13 13:02         ` Will Deacon
  2018-12-13 13:12           ` Ard Biesheuvel
  2018-12-13 13:15           ` John Garry
  1 sibling, 2 replies; 11+ messages in thread
From: Will Deacon @ 2018-12-13 13:02 UTC (permalink / raw)
  To: John Garry
  Cc: ard.biesheuvel, Robin Murphy, Linuxarm, linux-arm-kernel,
	Suzuki K Poulose

[+Ard]

On Thu, Dec 13, 2018 at 11:27:09AM +0000, John Garry wrote:
> On 12/12/2018 19:16, Suzuki K Poulose wrote:
> > On 12/12/2018 06:07 PM, John Garry wrote:
> > > On 12/12/2018 17:17, Robin Murphy wrote:
> > > > On 12/12/2018 16:48, John Garry wrote:
> > > > > I am finding our arm64 D05 board particularly slow to boot from
> > > > > linux-next, specifically a 30+ second pause when setting up CPU
> > > > > features.
> > > > 
> > > > That does look like it's almost certainly spending the mystery time in
> > > > setup_cpu_features() itself.
> > > 
> > > Seems to be "Kernel page table isolation (KPTI)" feature which we hang
> > > on, but I would not say that's conclusive.
> > > 
> > 
> > I think I have an idea what could be happening. The cpu_enable()
> > for KPTI, waits for all the secondary CPUs to enter a busy loop,
> > before installing the non-global mapping. So, with the changes in
> > -next, we batch the cpu_enable() callbacks, which implies the secondary
> > CPUs end up in the "cpu_enable()" for KPTI at different times.
> > 
> > Could you check if the following hack makes it any better ?
> > 
> 
> Unfortunately it does not help:
> [    5.502243] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
> [    5.502329] smp: Brought up 4 nodes, 64 CPUs
> [    7.442722] SMP: Total of 64 processors activated.
> [    7.447567] CPU features: detected: GIC system register CPU interface
> [    7.454098] CPU features: detected: 32-bit EL0 Support
> [    7.459291] CPU features: detected: CRC32 instructions
> [   40.236781] CPU: All CPU(s) started at EL2
> [   40.241062] alternatives: patching kernel code
> [   40.263213] devtmpfs: initialized
> [   40.267140] clocksource: jiffies: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 7645041785100000 ns
> [   40.277147] futex hash table entries: 16384 (order: 8, 1048576 bytes)
> [   40.284424] pinctrl core: initialized pinctrl subsystem
> [   40.290081] SMBIOS 3.0.0 present.
> 
> BTW, If you guys know the reason for this delay and it is not going to be an
> issue, then that's fine. I just wanted to raise awareness.
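
The size of the pause can be read straight off the printk timestamps. A minimal sketch, assuming the usual "[ seconds.micros] message" prefix; the function name largest_gap and the SAMPLE constant are illustrative, and the sample lines are copied from the log quoted above:

```python
import re

# Matches the "[ seconds.micros] message" prefix that printk adds to each line.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")


def largest_gap(log_text):
    """Return (gap_seconds, message_before, message_after) for the biggest jump.

    Lines without a timestamp (for example wrapped continuation lines) are
    skipped. Returns None when fewer than two stamped lines are found.
    """
    entries = []
    for line in log_text.splitlines():
        match = STAMP.search(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))

    best = None
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best


# Lines copied from the slow boot log quoted in this thread.
SAMPLE = """\
[    7.442722] SMP: Total of 64 processors activated.
[    7.447567] CPU features: detected: GIC system register CPU interface
[    7.454098] CPU features: detected: 32-bit EL0 Support
[    7.459291] CPU features: detected: CRC32 instructions
[   40.236781] CPU: All CPU(s) started at EL2
[   40.241062] alternatives: patching kernel code
"""

if __name__ == "__main__":
    gap, before, after = largest_gap(SAMPLE)
    print(f"{gap:.3f}s between '{before}' and '{after}'")
```

Because the pattern is searched for anywhere in the line, the same function also works on text that still carries "> " quote markers.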

I think I've managed to reproduce the issue locally and it appears to be
because of the default rodata=full changes. One impact of the change is
that the linear map is now mapped at page granularity, so the kpti work
to convert everything to non-global takes considerably longer.
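
To get a feel for the scale, the number of leaf entries needed to map the same amount of memory can be compared across mapping sizes. This is a rough sketch only: it assumes a 4 KiB translation granule (where arm64 also allows 2 MiB and 1 GiB block mappings), the 256 GiB figure is a hypothetical RAM size that does not come from this thread, and it counts leaf entries only, ignoring the intermediate table levels and whatever mix of block and page mappings a real linear map ends up with.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Mapping sizes available with a 4 KiB translation granule on arm64.
PAGE_SIZE = 4 * KIB   # level 3 page entry
PMD_BLOCK = 2 * MIB   # level 2 block entry
PUD_BLOCK = 1 * GIB   # level 1 block entry


def leaf_entries(memory_bytes, mapping_size):
    """Number of leaf descriptors needed to map memory_bytes, rounded up."""
    return -(-memory_bytes // mapping_size)


if __name__ == "__main__":
    memory = 256 * GIB  # hypothetical RAM size, not taken from the thread
    for name, size in (
        ("4 KiB pages", PAGE_SIZE),
        ("2 MiB blocks", PMD_BLOCK),
        ("1 GiB blocks", PUD_BLOCK),
    ):
        print(f"{name}: {leaf_entries(memory, size):,} entries")
```

Under these assumptions, page granularity needs 512 times as many leaf entries as 2 MiB blocks, which is the kind of growth that would make a walk over every entry noticeably slower.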

Given that most machines are not affected by meltdown, I think we should
probably look at expanding the whitelist we have, rather than pile more
complexity into the early page table code.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-13 13:02         ` Will Deacon
@ 2018-12-13 13:12           ` Ard Biesheuvel
  2018-12-13 13:50             ` John Garry
  2018-12-13 13:15           ` John Garry
  1 sibling, 1 reply; 11+ messages in thread
From: Ard Biesheuvel @ 2018-12-13 13:12 UTC (permalink / raw)
  To: Will Deacon
  Cc: John Garry, Robin Murphy, Linuxarm, linux-arm-kernel, Suzuki K. Poulose

On Thu, 13 Dec 2018 at 14:01, Will Deacon <will.deacon@arm.com> wrote:
>
> [+Ard]
>
> On Thu, Dec 13, 2018 at 11:27:09AM +0000, John Garry wrote:
> > On 12/12/2018 19:16, Suzuki K Poulose wrote:
> > > On 12/12/2018 06:07 PM, John Garry wrote:
> > > > On 12/12/2018 17:17, Robin Murphy wrote:
> > > > > On 12/12/2018 16:48, John Garry wrote:
> > > > > > I am finding our arm64 D05 board particularly slow to boot from
> > > > > > linux-next, specifically a 30+ second pause when setting up CPU
> > > > > > features.
> > > > >
> > > > > That does look like it's almost certainly spending the mystery time in
> > > > > setup_cpu_features() itself.
> > > >
> > > > Seems to be "Kernel page table isolation (KPTI)" feature which we hang
> > > > on, but I would not say that's conclusive.
> > > >
> > >
> > > I think I have an idea what could be happening. The cpu_enable()
> > > for KPTI, waits for all the secondary CPUs to enter a busy loop,
> > > before installing the non-global mapping. So, with the changes in
> > > > -next, we batch the cpu_enable() callbacks, which implies, the secondary
> > > CPUs end up in the "cpu_enable()" for KPTI at different times.
> > >
> > > Could you check if the following hack makes it any better ?
> > >
> >
> > Unfortunately it does not help:
> > [    5.502243] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
> > [    5.502329] smp: Brought up 4 nodes, 64 CPUs
> > [    7.442722] SMP: Total of 64 processors activated.
> > [    7.447567] CPU features: detected: GIC system register CPU interface
> > [    7.454098] CPU features: detected: 32-bit EL0 Support
> > [    7.459291] CPU features: detected: CRC32 instructions
> > [   40.236781] CPU: All CPU(s) started at EL2
> > [   40.241062] alternatives: patching kernel code
> > [   40.263213] devtmpfs: initialized
> > [   40.267140] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
> > [   40.277147] futex hash table entries: 16384 (order: 8, 1048576 bytes)
> > [   40.284424] pinctrl core: initialized pinctrl subsystem
> > [   40.290081] SMBIOS 3.0.0 present.
> >
> > BTW, If you guys know the reason for this delay and it is not going to be an
> > issue, then that's fine. I just wanted to raise awareness.
>
> I think I've managed to reproduce the issue locally and it appears to be
> because of the default rodata=full changes. One impact of the change is
> that the linear map is now mapped at page granularity, so the kpti work
> to convert everything to non-global takes considerably longer.
>

Hmm. At least we can easily check if this causes the reporter's issue
as well, by passing 'rodata=on' on the kernel command line

> Given that most machines are not affected by meltdown, I think we should
> probably look at expanding the whitelist we have, rather than pile more
> complexity into the early page table code.
>

Disabling KPTI also makes KASLR easily defeatable.


* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-13 13:02         ` Will Deacon
  2018-12-13 13:12           ` Ard Biesheuvel
@ 2018-12-13 13:15           ` John Garry
  1 sibling, 0 replies; 11+ messages in thread
From: John Garry @ 2018-12-13 13:15 UTC (permalink / raw)
  To: Will Deacon
  Cc: ard.biesheuvel, Robin Murphy, Linuxarm, linux-arm-kernel,
	Suzuki K Poulose

On 13/12/2018 13:02, Will Deacon wrote:
> [+Ard]
>
> On Thu, Dec 13, 2018 at 11:27:09AM +0000, John Garry wrote:
>> On 12/12/2018 19:16, Suzuki K Poulose wrote:
>>> On 12/12/2018 06:07 PM, John Garry wrote:
>>>> On 12/12/2018 17:17, Robin Murphy wrote:
>>>>> On 12/12/2018 16:48, John Garry wrote:
>>>>>> I am finding our arm64 D05 board particularly slow to boot from
>>>>>> linux-next, specifically a 30+ second pause when setting up CPU
>>>>>> features.
>>>>>
>>>>> That does look like it's almost certainly spending the mystery time in
>>>>> setup_cpu_features() itself.
>>>>
>>>> Seems to be "Kernel page table isolation (KPTI)" feature which we hang
>>>> on, but I would not say that's conclusive.
>>>>
>>>
>>> I think I have an idea what could be happening. The cpu_enable()
>>> for KPTI, waits for all the secondary CPUs to enter a busy loop,
>>> before installing the non-global mapping. So, with the changes in
>>> -next, we batch the cpu_enable() callbacks, which implies, the secondary
>>> CPUs end up in the "cpu_enable()" for KPTI at different times.
>>>
>>> Could you check if the following hack makes it any better ?
>>>
>>
>> Unfortunately it does not help:
>> [    5.502243] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
>> [    5.502329] smp: Brought up 4 nodes, 64 CPUs
>> [    7.442722] SMP: Total of 64 processors activated.
>> [    7.447567] CPU features: detected: GIC system register CPU interface
>> [    7.454098] CPU features: detected: 32-bit EL0 Support
>> [    7.459291] CPU features: detected: CRC32 instructions
>> [   40.236781] CPU: All CPU(s) started at EL2
>> [   40.241062] alternatives: patching kernel code
>> [   40.263213] devtmpfs: initialized
>> [   40.267140] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
>> [   40.277147] futex hash table entries: 16384 (order: 8, 1048576 bytes)
>> [   40.284424] pinctrl core: initialized pinctrl subsystem
>> [   40.290081] SMBIOS 3.0.0 present.
>>
>> BTW, If you guys know the reason for this delay and it is not going to be an
>> issue, then that's fine. I just wanted to raise awareness.
>
> I think I've managed to reproduce the issue locally and it appears to be
> because of the default rodata=full changes. One impact of the change is
> that the linear map is now mapped at page granularity, so the kpti work
> to convert everything to non-global takes considerably longer.
>
> Given that most machines are not affected by meltdown, I think we should
> probably look at expanding the whitelist we have, rather than pile more
> complexity into the early page table code.

OK, good to know.

This was my bisect log:
git bisect start
# bad: [606f8e7b27bfe30376348f8bc09cba17626dc24c] arm64: capabilities: Use linear array for detection and verification
git bisect bad 606f8e7b27bfe30376348f8bc09cba17626dc24c
# good: [40e020c129cfc991e8ab4736d2665351ffd1468d] Linux 4.20-rc6
git bisect good 40e020c129cfc991e8ab4736d2665351ffd1468d
# good: [9ff01193a20d391e8dbce4403dd5ef87c7eaaca6] Linux 4.20-rc3
git bisect good 9ff01193a20d391e8dbce4403dd5ef87c7eaaca6
# bad: [ad697a1aecac19ec351063b5d8e6fc9d4bca7ee5] linkage: add generic GLOBAL() macro
git bisect bad ad697a1aecac19ec351063b5d8e6fc9d4bca7ee5
# skip: [9eb1c92b47c73249465d388eaa394fe436a3b489] arm64: acpi: Prepare for longer MADTs
git bisect skip 9eb1c92b47c73249465d388eaa394fe436a3b489
# skip: [d8797b125711f23d83f5a71e908d34dfcd1fc3e9] arm64: Use a raw spinlock in __install_bp_hardening_cb()
git bisect skip d8797b125711f23d83f5a71e908d34dfcd1fc3e9
# bad: [c8ebf64eab743130fe404dc6679c2ff0cbc01615] arm64/module: use plt section indices for relocations
git bisect bad c8ebf64eab743130fe404dc6679c2ff0cbc01615
# good: [b34d2ef0c60e4d9c2bb8a4d72d4519c67363d390] arm64: mm: purge lazily unmapped vm regions before changing permissions
git bisect good b34d2ef0c60e4d9c2bb8a4d72d4519c67363d390
# bad: [c55191e96caa9d787e8f682c5e525b7f8172a3b4] arm64: mm: apply r/o permissions of VM areas to its linear alias as well
git bisect bad c55191e96caa9d787e8f682c5e525b7f8172a3b4
# first bad commit: [c55191e96caa9d787e8f682c5e525b7f8172a3b4] arm64: mm: apply r/o permissions of VM areas to its linear alias as well
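
A log like this can be summarised mechanically, which makes it easier to see how many steps were skipped. The sketch below assumes the "git bisect <verdict> <sha>" and "# first bad commit: [<sha>]" line shapes seen in the log; summarize and SAMPLE are illustrative names, and any line that matches neither shape (such as wrapped comment text) is ignored:

```python
import re
from collections import Counter

# A verdict line looks like "git bisect <good|bad|skip> <hex sha>".
VERDICT = re.compile(r"^git bisect (good|bad|skip) ([0-9a-f]{7,40})$")
# The final marker that names the culprit commit.
FIRST_BAD = re.compile(r"^# first bad commit: \[([0-9a-f]{7,40})\]")


def summarize(log_text):
    """Return (Counter of verdicts, sha of the first bad commit or None)."""
    counts = Counter()
    first_bad = None
    for raw in log_text.splitlines():
        line = raw.strip()
        verdict = VERDICT.match(line)
        if verdict:
            counts[verdict.group(1)] += 1
            continue
        culprit = FIRST_BAD.match(line)
        if culprit:
            first_bad = culprit.group(1)
    return counts, first_bad


# Verdict lines and the final marker, copied from the bisect log above.
SAMPLE = """\
git bisect start
git bisect bad 606f8e7b27bfe30376348f8bc09cba17626dc24c
git bisect good 40e020c129cfc991e8ab4736d2665351ffd1468d
git bisect good 9ff01193a20d391e8dbce4403dd5ef87c7eaaca6
git bisect bad ad697a1aecac19ec351063b5d8e6fc9d4bca7ee5
git bisect skip 9eb1c92b47c73249465d388eaa394fe436a3b489
git bisect skip d8797b125711f23d83f5a71e908d34dfcd1fc3e9
git bisect bad c8ebf64eab743130fe404dc6679c2ff0cbc01615
git bisect good b34d2ef0c60e4d9c2bb8a4d72d4519c67363d390
git bisect bad c55191e96caa9d787e8f682c5e525b7f8172a3b4
# first bad commit: [c55191e96caa9d787e8f682c5e525b7f8172a3b4] arm64: mm: apply
"""

if __name__ == "__main__":
    counts, first_bad = summarize(SAMPLE)
    print(dict(counts), first_bad)
```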

It was inconclusive from testing and I was getting tired of rerunning, 
so sent the mail...

Cheers,
John

>
> Will
>
> .
>




* Re: arm64 board boot pauses on linux-next (resend)
  2018-12-13 13:12           ` Ard Biesheuvel
@ 2018-12-13 13:50             ` John Garry
  0 siblings, 0 replies; 11+ messages in thread
From: John Garry @ 2018-12-13 13:50 UTC (permalink / raw)
  To: Ard Biesheuvel, Will Deacon
  Cc: Robin Murphy, Linuxarm, linux-arm-kernel, Suzuki K. Poulose

On 13/12/2018 13:12, Ard Biesheuvel wrote:
> On Thu, 13 Dec 2018 at 14:01, Will Deacon <will.deacon@arm.com> wrote:
>>
>> [+Ard]
>>
>> On Thu, Dec 13, 2018 at 11:27:09AM +0000, John Garry wrote:
>>> On 12/12/2018 19:16, Suzuki K Poulose wrote:
>>>> On 12/12/2018 06:07 PM, John Garry wrote:
>>>>> On 12/12/2018 17:17, Robin Murphy wrote:
>>>>>> On 12/12/2018 16:48, John Garry wrote:
>>>>>>> I am finding our arm64 D05 board particularly slow to boot from
>>>>>>> linux-next, specifically a 30+ second pause when setting up CPU
>>>>>>> features.
>>>>>>
>>>>>> That does look like it's almost certainly spending the mystery time in
>>>>>> setup_cpu_features() itself.
>>>>>
>>>>> Seems to be "Kernel page table isolation (KPTI)" feature which we hang
>>>>> on, but I would not say that's conclusive.
>>>>>
>>>>
>>>> I think I have an idea what could be happening. The cpu_enable()
>>>> for KPTI, waits for all the secondary CPUs to enter a busy loop,
>>>> before installing the non-global mapping. So, with the changes in
>>>> -next, we batch the cpu_enable() callbacks, which implies, the secondary
>>>> CPUs end up in the "cpu_enable()" for KPTI at different times.
>>>>
>>>> Could you check if the following hack makes it any better ?
>>>>
>>>
>>> Unfortunately it does not help:
>>> [    5.502243] CPU63: Booted secondary processor 0x0000070303 [0x410fd082]
>>> [    5.502329] smp: Brought up 4 nodes, 64 CPUs
>>> [    7.442722] SMP: Total of 64 processors activated.
>>> [    7.447567] CPU features: detected: GIC system register CPU interface
>>> [    7.454098] CPU features: detected: 32-bit EL0 Support
>>> [    7.459291] CPU features: detected: CRC32 instructions
>>> [   40.236781] CPU: All CPU(s) started at EL2
>>> [   40.241062] alternatives: patching kernel code
>>> [   40.263213] devtmpfs: initialized
>>> [   40.267140] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
>>> [   40.277147] futex hash table entries: 16384 (order: 8, 1048576 bytes)
>>> [   40.284424] pinctrl core: initialized pinctrl subsystem
>>> [   40.290081] SMBIOS 3.0.0 present.
>>>
>>> BTW, If you guys know the reason for this delay and it is not going to be an
>>> issue, then that's fine. I just wanted to raise awareness.
>>
>> I think I've managed to reproduce the issue locally and it appears to be
>> because of the default rodata=full changes. One impact of the change is
>> that the linear map is now mapped at page granularity, so the kpti work
>> to convert everything to non-global takes considerably longer.
>>
>
> Hmm. At least we can easily check if this causes the reporter's issue
> as well, by passing 'rodata=on' on the kernel command line

That fixed it.

John

>
>> Given that most machines are not affected by meltdown, I think we should
>> probably look at expanding the whitelist we have, rather than pile more
>> complexity into the early page table code.
>>
>
> Disabling KPTI also makes KASLR easily defeatable.
>
> .
>




Thread overview: 11+ messages
2018-12-12 16:48 arm64 board boot pauses on linux-next (resend) John Garry
2018-12-12 17:17 ` Robin Murphy
2018-12-12 18:07   ` John Garry
2018-12-12 19:16     ` Suzuki K Poulose
2018-12-13 11:27       ` John Garry
2018-12-13 11:33         ` Suzuki K Poulose
2018-12-13 12:59           ` John Garry
2018-12-13 13:02         ` Will Deacon
2018-12-13 13:12           ` Ard Biesheuvel
2018-12-13 13:50             ` John Garry
2018-12-13 13:15           ` John Garry
