linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Random crashes with i386 and efi boots
@ 2018-09-10 21:56 Guenter Roeck
  2018-09-11 11:52 ` Andy Lutomirski
  0 siblings, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2018-09-10 21:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ard Biesheuvel, Joerg Roedel, Thomas Gleixner, Michal Hocko,
	Andi Kleen, Linus Torvalds, Dave Hansen, Pavel Machek, linux-efi,
	x86

Hi folks,

even after commit eeb89e2bb1ac ("x86/efi: Load fixmap GDT in
efi_call_phys_epilog()"), my i386/efi qemu boot tests still crash randomly
(roughly 5-10% of the time). As before, I don't see much useful output in
the qemu log (this time it doesn't even complain about a triple fault). 

Debugging shows that the crash happens in efi_call_phys_epilog().
A sample log from a crashed test run is attached below. It appears that
the crash happens if there is an interrupt at a critical section of the
code.

While playing with the code, I found a possible fix.

diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index 05ca14222463..9959657127f4 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -85,10 +85,9 @@ pgd_t * __init efi_call_phys_prolog(void)
 
 void __init efi_call_phys_epilog(pgd_t *save_pgd)
 {
+	load_fixmap_gdt(0);
 	load_cr3(save_pgd);
 	__flush_tlb_all();
-
-	load_fixmap_gdt(0);
 }

This restores the execution order prior to commit eeb89e2bb1ac.
I have no real idea what I am doing, so this change is to be taken with
a grain of salt. All I can say is that 100 boots with the above change
were successful while the current upstream code (v4.19-rc3) crashes on
a regular basis (in a controlled test I observed 6 failures out of 100
boots).

It would be great if someone with a bit more experience can have another
look and figure out the underlying problem. Please let me know if I can
provide additional information.

Thanks,
Guenter

----------------
IN:
# efi_call_phys_epilog(save_pgd);
0xd8f9c12c:  8b 45 f0                 movl     -0x10(%ebp), %eax
0xd8f9c12f:  e8 49 01 00 00           calll    0xd8f9c27d

----------------
IN:
# efi_call_phys_epilog():
# load_cr3();
0xd8f9c27d:  55                       pushl    %ebp
0xd8f9c27e:  05 00 00 00 40           addl     $0x40000000, %eax
0xd8f9c283:  89 e5                    movl     %esp, %ebp
0xd8f9c285:  0f 22 d8                 movl     %eax, %cr3

CR3 update: CR3=1904e000
----------------
IN:
# __flush_tlb_all();
0xd8f9c288:  e8 c8 5e 2b ff           calll    0xd8252155

EAX=1904e000 EBX=00000000 ECX=d8f9c126 EDX=d8eafd60
ESI=1f09a000 EDI=00000030 EBP=d8e99f3c ESP=d8e99f3c
EIP=d8f9c288 EFL=00200207 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
CS =0060 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0068 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
FS =00d8 06c92000 ffffffff 008f9300 DPL=0 DS16 [-WA]
GS =00e0 dfcd09c0 00000018 00409100 DPL=0 DS   [--A]
LDT=0000 00000000 00000000 00008200 DPL=0 LDT
TR =0080 ff803000 0000206b 00008900 DPL=0 TSS32-avl
GDT=     1fcc0000 000000ff
IDT=     ff800000 000007ff
CR0=80050033 CR2=ffd17000 CR3=1904e000 CR4=00040690
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
CCS=40000000 CCD=1904e000 CCO=ADDL
EFER=0000000000000000
Servicing hardware INT=0x30
  508: v=30 e=0000 i=0 cpl=0 IP=0060:d8f9c288 pc=d8f9c288 SP=0068:d8e99f3c env->regs[R_EAX]=1904e000
X=1904e000 EBX=00000000 ECX=d8f9c126 EDX=d8eafd60
ESI=1f09a000 EDI=00000030 EBP=d8e99f3c ESP=d8e99f3c
EIP=d8f9c288 EFL=00200207 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
CS =0060 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0068 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
FS =00d8 06c92000 ffffffff 008f9300 DPL=0 DS16 [-WA]
GS =00e0 dfcd09c0 00000018 00409100 DPL=0 DS   [--A]
LDT=0000 00000000 00000000 00008200 DPL=0 LDT
TR =0080 ff803000 0000206b 00008900 DPL=0 TSS32-avl
GDT=     1fcc0000 000000ff
IDT=     ff800000 000007ff
CR0=80050033 CR2=ffd17000 CR3=1904e000 CR4=00040690
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
CCS=00000005 CCD=1904e000 CCO=EFLAGS
EFER=0000000000000000
check_exception old: 0xffffffff new 0xe
  509: v=0e e=0000 i=0 cpl=0 IP=0060:d8f9c288 pc=d8f9c288 SP=0068:d8e99f3c CR2=1fcc0060
EAX=1904e000 EBX=00000000 ECX=d8f9c126 EDX=d8eafd60
ESI=1f09a000 EDI=00000030 EBP=d8e99f3c ESP=d8e99f3c
EIP=d8f9c288 EFL=00200207 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
CS =0060 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0068 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
FS =00d8 06c92000 ffffffff 008f9300 DPL=0 DS16 [-WA]
GS =00e0 dfcd09c0 00000018 00409100 DPL=0 DS   [--A]
LDT=0000 00000000 00000000 00008200 DPL=0 LDT
TR =0080 ff803000 0000206b 00008900 DPL=0 TSS32-avl
GDT=     1fcc0000 000000ff
IDT=     ff800000 000007ff
CR0=80050033 CR2=1fcc0060 CR3=1904e000 CR4=00040690
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
CCS=00000005 CCD=1904e000 CCO=EFLAGS
EFER=0000000000000000
check_exception old: 0xe new 0xe
  510: v=08 e=0000 i=0 cpl=0 IP=0060:d8f9c288 pc=d8f9c288 SP=0068:d8e99f3c env->regs[R_EAX]=1904e000
EAX=1904e000 EBX=00000000 ECX=d8f9c126 EDX=d8eafd60
ESI=1f09a000 EDI=00000030 EBP=d8e99f3c ESP=d8e99f3c
EIP=d8f9c288 EFL=00200207 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
CS =0060 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0068 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =007b 00000000 ffffffff 00cff300 DPL=3 DS   [-WA]
FS =00d8 06c92000 ffffffff 008f9300 DPL=0 DS16 [-WA]
GS =00e0 dfcd09c0 00000018 00409100 DPL=0 DS   [--A]
LDT=0000 00000000 00000000 00008200 DPL=0 LDT
TR =0080 ff803000 0000206b 00008900 DPL=0 TSS32-avl
GDT=     1fcc0000 000000ff
IDT=     ff800000 000007ff
CR0=80050033 CR2=1fcc0060 CR3=1904e000 CR4=00040690
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
CCS=00000005 CCD=1904e000 CCO=EFLAGS
EFER=0000000000000000
check_exception old: 0x8 new 0xe

# qemu dies silently here

---

In another log, the code proceeeds a little further:

CR3 update: CR3=13668000
----------------
IN:
0xd35b6412:  e8 ae cb 29 ff           calll    0xd2852fc5

----------------
IN:
0xd35b6417:  31 c0                    xorl     %eax, %eax
0xd35b6419:  e8 12 2e 27 ff           calll    0xd2829230

----------------
IN:
0xd2829230:  55                       pushl    %ebp
0xd2829231:  89 e5                    movl     %esp, %ebp
0xd2829233:  83 ec 08                 subl     $8, %esp
0xd2829236:  e8 75 6c 02 00           calll    0xd284feb0

Servicing hardware INT=0x30

... subsequent log messages and crash are as before.

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-10 21:56 Random crashes with i386 and efi boots Guenter Roeck
@ 2018-09-11 11:52 ` Andy Lutomirski
  2018-09-11 13:30   ` Guenter Roeck
  0 siblings, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2018-09-11 11:52 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, Ard Biesheuvel, Joerg Roedel, Thomas Gleixner,
	Michal Hocko, Andi Kleen, Linus Torvalds, Dave Hansen,
	Pavel Machek, linux-efi, x86



> On Sep 10, 2018, at 2:56 PM, Guenter Roeck <linux@roeck-us.net> wrote:
> 
> Hi folks,
> 
> even after commit eeb89e2bb1ac ("x86/efi: Load fixmap GDT in
> efi_call_phys_epilog()"), my i386/efi qemu boot tests still crash randomly
> (roughly 5-10% of the time). As before, I don't see much useful output in
> the qemu log (this time it doesn't even complain about a triple fault). 
> 
> Debugging shows that the crash happens in efi_call_phys_epilog().
> A sample log from a crashed test run is attached below. It appears that
> the crash happens if there is an interrupt at a critical section of the
> code.
> 
> While playing with the code, I found a possible fix.
> 
> diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
> index 05ca14222463..9959657127f4 100644
> --- a/arch/x86/platform/efi/efi_32.c
> +++ b/arch/x86/platform/efi/efi_32.c
> @@ -85,10 +85,9 @@ pgd_t * __init efi_call_phys_prolog(void)
> 
> void __init efi_call_phys_epilog(pgd_t *save_pgd)
> {
> +    load_fixmap_gdt(0);
>    load_cr3(save_pgd);
>    __flush_tlb_all();
> -
> -    load_fixmap_gdt(0);
> }

We have IRQs on here?  It seems plausible that we’re in a window where the EFI pgd doesn’t have cpu_entry_area mapped. Also, the hard coded CPU 0 is suspicious.

Maybe try instrumenting the code to check whether the clone_pgd_range calls in setup_percpu.c have happened yet?

Your patch may well be correct, but, if we have IRQs on, we should really have cpu_entry_area mapped in both pgds.

Or we could turn off IRQs. Why on Earth are IRQs on in a context where the fixmap gdt is unusable?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-11 11:52 ` Andy Lutomirski
@ 2018-09-11 13:30   ` Guenter Roeck
  2018-09-11 16:36     ` Andy Lutomirski
  0 siblings, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2018-09-11 13:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, Ard Biesheuvel, Joerg Roedel, Thomas Gleixner,
	Michal Hocko, Andi Kleen, Linus Torvalds, Dave Hansen,
	Pavel Machek, linux-efi, x86

On 09/11/2018 04:52 AM, Andy Lutomirski wrote:
> 
> 
>> On Sep 10, 2018, at 2:56 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> Hi folks,
>>
>> even after commit eeb89e2bb1ac ("x86/efi: Load fixmap GDT in
>> efi_call_phys_epilog()"), my i386/efi qemu boot tests still crash randomly
>> (roughly 5-10% of the time). As before, I don't see much useful output in
>> the qemu log (this time it doesn't even complain about a triple fault).
>>
>> Debugging shows that the crash happens in efi_call_phys_epilog().
>> A sample log from a crashed test run is attached below. It appears that
>> the crash happens if there is an interrupt at a critical section of the
>> code.
>>
>> While playing with the code, I found a possible fix.
>>
>> diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
>> index 05ca14222463..9959657127f4 100644
>> --- a/arch/x86/platform/efi/efi_32.c
>> +++ b/arch/x86/platform/efi/efi_32.c
>> @@ -85,10 +85,9 @@ pgd_t * __init efi_call_phys_prolog(void)
>>
>> void __init efi_call_phys_epilog(pgd_t *save_pgd)
>> {
>> +    load_fixmap_gdt(0);
>>     load_cr3(save_pgd);
>>     __flush_tlb_all();
>> -
>> -    load_fixmap_gdt(0);
>> }
> 
> We have IRQs on here?  It seems plausible that we’re in a window where the EFI pgd doesn’t have cpu_entry_area mapped. Also, the hard coded CPU 0 is suspicious.
> 
The hard coded CPU 0 was always there. The call is ultimately from
efi_enter_virtual_mode(), which is called from start_kernel().
so presumably it is guaranteed to run on CPU 0.

> Maybe try instrumenting the code to check whether the clone_pgd_range calls in setup_percpu.c have happened yet?
> 
The crash is seen late in the boot process, so I am quite sure it happened,
but I can add a check if needed. I think that might be a different problem,
though.

> Your patch may well be correct, but, if we have IRQs on, we should really have cpu_entry_area mapped in both pgds.
> 
> Or we could turn off IRQs. Why on Earth are IRQs on in a context where the fixmap gdt is unusable?
> 

 From arch/x86/platform/efi/efi.c:phys_efi_set_virtual_address_map():

	save_pgd = efi_call_phys_prolog();
	local_irq_save(flags);
	status = efi_call_phys(...);
         local_irq_restore(flags);

         efi_call_phys_epilog(save_pgd);

So, yes, interrupts are very much enabled.

I ran several additional test sequences. With above patch, no failures with
500 boots. Without it, failure rate (long term average) across 500 boots
is around 10%. Another data point: Moving load_fixmap_gdt(0); after
load_cr3(save_pgd); does not help; it has to come first.

Guenter

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-11 13:30   ` Guenter Roeck
@ 2018-09-11 16:36     ` Andy Lutomirski
  2018-09-11 17:41       ` Joerg Roedel
  0 siblings, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2018-09-11 16:36 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, Ard Biesheuvel, Joerg Roedel, Thomas Gleixner,
	Michal Hocko, Andi Kleen, Linus Torvalds, Dave Hansen,
	Pavel Machek, linux-efi, x86



> On Sep 11, 2018, at 6:30 AM, Guenter Roeck <linux@roeck-us.net> wrote:
> 
> On 09/11/2018 04:52 AM, Andy Lutomirski wrote:
>>> On Sep 10, 2018, at 2:56 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>>> 
>>> Hi folks,
>>> 
>>> even after commit eeb89e2bb1ac ("x86/efi: Load fixmap GDT in
>>> efi_call_phys_epilog()"), my i386/efi qemu boot tests still crash randomly
>>> (roughly 5-10% of the time). As before, I don't see much useful output in
>>> the qemu log (this time it doesn't even complain about a triple fault).
>>> 
>>> Debugging shows that the crash happens in efi_call_phys_epilog().
>>> A sample log from a crashed test run is attached below. It appears that
>>> the crash happens if there is an interrupt at a critical section of the
>>> code.
>>> 
>>> While playing with the code, I found a possible fix.
>>> 
>>> diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
>>> index 05ca14222463..9959657127f4 100644
>>> --- a/arch/x86/platform/efi/efi_32.c
>>>  +++ b/arch/x86/platform/efi/efi_32.c
>>> @@ -85,10 +85,9 @@ pgd_t * __init efi_call_phys_prolog(void)
>>> 
>>> void __init efi_call_phys_epilog(pgd_t *save_pgd)
>>> {
>>> +    load_fixmap_gdt(0);
>>>    load_cr3(save_pgd);
>>>    __flush_tlb_all();
>>> -
>>> -    load_fixmap_gdt(0);
>>> }
>> We have IRQs on here?  It seems plausible that we’re in a window where the EFI pgd doesn’t have cpu_entry_area mapped. Also, the hard coded CPU 0 is suspicious.
> The hard coded CPU 0 was always there. The call is ultimately from
> efi_enter_virtual_mode(), which is called from start_kernel().
> so presumably it is guaranteed to run on CPU 0.
> 
>> Maybe try instrumenting the code to check whether the clone_pgd_range calls in setup_percpu.c have happened yet?
> The crash is seen late in the boot process, so I am quite sure it happened,
> but I can add a check if needed. I think that might be a different problem,
> though.
> 
>> Your patch may well be correct, but, if we have IRQs on, we should really have cpu_entry_area mapped in both pgds.
>> Or we could turn off IRQs. Why on Earth are IRQs on in a context where the fixmap gdt is unusable?
> 
> From arch/x86/platform/efi/efi.c:phys_efi_set_virtual_address_map():
> 
>    save_pgd = efi_call_phys_prolog();
>    local_irq_save(flags);
>    status = efi_call_phys(...);
>        local_irq_restore(flags);
> 
>        efi_call_phys_epilog(save_pgd);
> 
> So, yes, interrupts are very much enabled.

Does fixing that solve the problem?  It seems more robust.

> 
> I ran several additional test sequences. With above patch, no failures with
> 500 boots. Without it, failure rate (long term average) across 500 boots
> is around 10%. Another data point: Moving load_fixmap_gdt(0); after
> load_cr3(save_pgd); does not help; it has to come first.
> 
> Guenter

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-11 16:36     ` Andy Lutomirski
@ 2018-09-11 17:41       ` Joerg Roedel
  2018-09-11 17:59         ` Linus Torvalds
                           ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Joerg Roedel @ 2018-09-11 17:41 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Guenter Roeck, linux-kernel, Ard Biesheuvel, Thomas Gleixner,
	Michal Hocko, Andi Kleen, Linus Torvalds, Dave Hansen,
	Pavel Machek, linux-efi, x86

On Tue, Sep 11, 2018 at 09:36:51AM -0700, Andy Lutomirski wrote:
> >    save_pgd = efi_call_phys_prolog();
> >    local_irq_save(flags);
> >    status = efi_call_phys(...);
> >        local_irq_restore(flags);
> > 
> >        efi_call_phys_epilog(save_pgd);
> > 
> > So, yes, interrupts are very much enabled.
> 
> Does fixing that solve the problem?  It seems more robust.

The problem is still that in efi_call_phys_prolog() we load the gdt with
its physical address, and when we reload the %cr3 in _epilog from
initial_page_table to swapper_pg_dir again the gdt is no longer mapped.
Blocking interrupts is more robust, but we can't block NMIs that way
that would also trigger the issue, no?

So I am in favor of changing the order in efi_call_phys_epilog() too.

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-11 17:41       ` Joerg Roedel
@ 2018-09-11 17:59         ` Linus Torvalds
  2018-09-11 17:59         ` Guenter Roeck
  2018-09-11 18:05         ` Andy Lutomirski
  2 siblings, 0 replies; 9+ messages in thread
From: Linus Torvalds @ 2018-09-11 17:59 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Andy Lutomirski, Guenter Roeck, Linux Kernel Mailing List,
	Ard Biesheuvel, Thomas Gleixner, Michal Hocko, Andi Kleen,
	Dave Hansen, Pavel Machek, linux-efi, the arch/x86 maintainers

On Tue, Sep 11, 2018 at 7:42 AM Joerg Roedel <jroedel@suse.de> wrote:
>
> So I am in favor of changing the order in efi_call_phys_epilog() too.

Yeah, the only reason the order was changed was that it seemed more
logical to use the new cr3 to load the GDT. But that was clearly
wrong.

So ack on just changing the order, and my bad for suggesting the
change in the first place.

            Linus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-11 17:41       ` Joerg Roedel
  2018-09-11 17:59         ` Linus Torvalds
@ 2018-09-11 17:59         ` Guenter Roeck
  2018-09-11 18:05         ` Andy Lutomirski
  2 siblings, 0 replies; 9+ messages in thread
From: Guenter Roeck @ 2018-09-11 17:59 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Andy Lutomirski, linux-kernel, Ard Biesheuvel, Thomas Gleixner,
	Michal Hocko, Andi Kleen, Linus Torvalds, Dave Hansen,
	Pavel Machek, linux-efi, x86

On Tue, Sep 11, 2018 at 07:41:58PM +0200, Joerg Roedel wrote:
> On Tue, Sep 11, 2018 at 09:36:51AM -0700, Andy Lutomirski wrote:
> > >    save_pgd = efi_call_phys_prolog();
> > >    local_irq_save(flags);
> > >    status = efi_call_phys(...);
> > >        local_irq_restore(flags);
> > > 
> > >        efi_call_phys_epilog(save_pgd);
> > > 
> > > So, yes, interrupts are very much enabled.
> > 
> > Does fixing that solve the problem?  It seems more robust.
> 
> The problem is still that in efi_call_phys_prolog() we load the gdt with
> its physical address, and when we reload the %cr3 in _epilog from
> initial_page_table to swapper_pg_dir again the gdt is no longer mapped.
> Blocking interrupts is more robust, but we can't block NMIs that way
> that would also trigger the issue, no?
> 
> So I am in favor of changing the order in efi_call_phys_epilog() too.
> 
Blocking the interrupts across efi_call_phys_prolog() / efi_call_phys() /
efi_call_phys_epilog() does solve the issue, but I am concerned about its
negative effect on 64-bit operation (not sure if blocking interrupts
in the prolog/epilog functions there is ok). Adding another set of
local_irq_save() / local_irq_restore() to efi_call_phys_epilog() seems
wasteful if the problem can be solved by reordering instructions.

So, yes, I also prefer changing the order in efi_call_phys_epilog().
I'll submit a patch along that line.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-11 17:41       ` Joerg Roedel
  2018-09-11 17:59         ` Linus Torvalds
  2018-09-11 17:59         ` Guenter Roeck
@ 2018-09-11 18:05         ` Andy Lutomirski
  2018-09-11 18:22           ` Guenter Roeck
  2 siblings, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2018-09-11 18:05 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Guenter Roeck, linux-kernel, Ard Biesheuvel, Thomas Gleixner,
	Michal Hocko, Andi Kleen, Linus Torvalds, Dave Hansen,
	Pavel Machek, linux-efi, x86



> On Sep 11, 2018, at 10:41 AM, Joerg Roedel <jroedel@suse.de> wrote:
> 
> On Tue, Sep 11, 2018 at 09:36:51AM -0700, Andy Lutomirski wrote:
>>>   save_pgd = efi_call_phys_prolog();
>>>   local_irq_save(flags);
>>>   status = efi_call_phys(...);
>>>       local_irq_restore(flags);
>>> 
>>>       efi_call_phys_epilog(save_pgd);
>>> 
>>> So, yes, interrupts are very much enabled.
>> 
>> Does fixing that solve the problem?  It seems more robust.
> 
> The problem is still that in efi_call_phys_prolog() we load the gdt with
> its physical address, and when we reload the %cr3 in _epilog from
> initial_page_table to swapper_pg_dir again the gdt is no longer mapped.
> Blocking interrupts is more robust, but we can't block NMIs that way
> that would also trigger the issue, no?
> 
> So I am in favor of changing the order in efi_call_phys_epilog() too.
> 

I’m rather confused here.  We’re loading CR3 with page tables that don’t have the fixmap mapped?  With interrupts on?  And we expect it to work?  This is *nuts*.

There are IMO only three sane fixes here:

1. Load the fixmap, cpu_entry_area, etc into the EFI page table.  Drop the GDT reload entirely.

2. Do this whole virtual map dance earlier so we don’t have IRQs and NMIs and such. Maybe while we’re still using the initial page table?

3. Just identity map all the EFI regions. Make EFI page tables that literally map them at their physical addresses *and* map the entire kernel, just like we do for normal user mms.

Sure, as a stopgap, turning off IRQs and applying Guenter’s patch seems okay, but this code is not okay.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Random crashes with i386 and efi boots
  2018-09-11 18:05         ` Andy Lutomirski
@ 2018-09-11 18:22           ` Guenter Roeck
  0 siblings, 0 replies; 9+ messages in thread
From: Guenter Roeck @ 2018-09-11 18:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Joerg Roedel, linux-kernel, Ard Biesheuvel, Thomas Gleixner,
	Michal Hocko, Andi Kleen, Linus Torvalds, Dave Hansen,
	Pavel Machek, linux-efi, x86

On Tue, Sep 11, 2018 at 11:05:25AM -0700, Andy Lutomirski wrote:
> 
> 
> > On Sep 11, 2018, at 10:41 AM, Joerg Roedel <jroedel@suse.de> wrote:
> > 
> > On Tue, Sep 11, 2018 at 09:36:51AM -0700, Andy Lutomirski wrote:
> >>>   save_pgd = efi_call_phys_prolog();
> >>>   local_irq_save(flags);
> >>>   status = efi_call_phys(...);
> >>>       local_irq_restore(flags);
> >>> 
> >>>       efi_call_phys_epilog(save_pgd);
> >>> 
> >>> So, yes, interrupts are very much enabled.
> >> 
> >> Does fixing that solve the problem?  It seems more robust.
> > 
> > The problem is still that in efi_call_phys_prolog() we load the gdt with
> > its physical address, and when we reload the %cr3 in _epilog from
> > initial_page_table to swapper_pg_dir again the gdt is no longer mapped.
> > Blocking interrupts is more robust, but we can't block NMIs that way
> > that would also trigger the issue, no?
> > 
> > So I am in favor of changing the order in efi_call_phys_epilog() too.
> > 
> 
> I’m rather confused here.  We’re loading CR3 with page tables that don’t have the fixmap mapped?  With interrupts on?  And we expect it to work?  This is *nuts*.
> 
> There are IMO only three sane fixes here:
> 
> 1. Load the fixmap, cpu_entry_area, etc into the EFI page table.  Drop the GDT reload entirely.
> 
> 2. Do this whole virtual map dance earlier so we don’t have IRQs and NMIs and such. Maybe while we’re still using the initial page table?
> 
> 3. Just identity map all the EFI regions. Make EFI page tables that literally map them at their physical addresses *and* map the entire kernel, just like we do for normal user mms.
> 
> Sure, as a stopgap, turning off IRQs and applying Guenter’s patch seems okay, but this code is not okay.

I submitted a patch with the diff I suggested above; it seems to be the
least invasive solution and addresses the immediate problem.

I am way out of league regarding the other suggested changes. I'll be happy
to test the code if someone is willing to rearrange the code accordingly,
but I don't think it would make sense to even try doing it myself.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-09-11 18:22 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-10 21:56 Random crashes with i386 and efi boots Guenter Roeck
2018-09-11 11:52 ` Andy Lutomirski
2018-09-11 13:30   ` Guenter Roeck
2018-09-11 16:36     ` Andy Lutomirski
2018-09-11 17:41       ` Joerg Roedel
2018-09-11 17:59         ` Linus Torvalds
2018-09-11 17:59         ` Guenter Roeck
2018-09-11 18:05         ` Andy Lutomirski
2018-09-11 18:22           ` Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).