linuxppc-dev.lists.ozlabs.org archive mirror
From: Michael Ellerman <mpe@ellerman.id.au>
To: Daniel Axtens <dja@axtens.net>, linuxppc-dev@lists.ozlabs.org
Cc: aneesh.kumar@linux.ibm.com
Subject: Re: [PATCH 5/6] powerpc/mm/64s/hash: Add real-mode change_memory_range() for hash LPAR
Date: Fri, 19 Mar 2021 22:56:47 +1100
Message-ID: <871rcb8igw.fsf@mpe.ellerman.id.au>
In-Reply-To: <87h7m8pyk5.fsf@dja-thinkpad.axtens.net>

Daniel Axtens <dja@axtens.net> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>
>> When we enabled STRICT_KERNEL_RWX we received some reports of boot
>> failures when using the Hash MMU and running under phyp. The crashes
>> are intermittent, and often exhibit as a completely unresponsive
>> system, or possibly an oops.
>>
>> One example, which was caught in xmon:
>>
>>   [   14.068327][    T1] devtmpfs: mounted
>>   [   14.069302][    T1] Freeing unused kernel memory: 5568K
>>   [   14.142060][  T347] BUG: Unable to handle kernel instruction fetch
>>   [   14.142063][    T1] Run /sbin/init as init process
>>   [   14.142074][  T347] Faulting instruction address: 0xc000000000004400
>>   cpu 0x2: Vector: 400 (Instruction Access) at [c00000000c7475e0]
>>       pc: c000000000004400: exc_virt_0x4400_instruction_access+0x0/0x80
>>       lr: c0000000001862d4: update_rq_clock+0x44/0x110
>>       sp: c00000000c747880
>>      msr: 8000000040001031
>>     current = 0xc00000000c60d380
>>     paca    = 0xc00000001ec9de80   irqmask: 0x03   irq_happened: 0x01
>>       pid   = 347, comm = kworker/2:1
>>   ...
>>   enter ? for help
>>   [c00000000c747880] c0000000001862d4 update_rq_clock+0x44/0x110 (unreliable)
>>   [c00000000c7478f0] c000000000198794 update_blocked_averages+0xb4/0x6d0
>>   [c00000000c7479f0] c000000000198e40 update_nohz_stats+0x90/0xd0
>>   [c00000000c747a20] c0000000001a13b4 _nohz_idle_balance+0x164/0x390
>>   [c00000000c747b10] c0000000001a1af8 newidle_balance+0x478/0x610
>>   [c00000000c747be0] c0000000001a1d48 pick_next_task_fair+0x58/0x480
>>   [c00000000c747c40] c000000000eaab5c __schedule+0x12c/0x950
>>   [c00000000c747cd0] c000000000eab3e8 schedule+0x68/0x120
>>   [c00000000c747d00] c00000000016b730 worker_thread+0x130/0x640
>>   [c00000000c747da0] c000000000174d50 kthread+0x1a0/0x1b0
>>   [c00000000c747e10] c00000000000e0f0 ret_from_kernel_thread+0x5c/0x6c
>>
>> This shows that CPU 2, which was idle, woke up and then appears to
>> randomly take an instruction fault on a completely valid area of
>> kernel text.
>>
>> The cause turns out to be the call to hash__mark_rodata_ro(), late in
>> boot. Due to the way we lay out text and rodata, that function actually
>> changes the permissions for all of text and rodata to read-only plus
>> execute.
>>
>> To do the permission change we use a hypervisor call, H_PROTECT. On
>> phyp that appears to be implemented by briefly removing the mapping of
>> the kernel text, before putting it back with the updated permissions.
>> If any other CPU is executing during that window, it will see spurious
>> faults on the kernel text and/or data, leading to crashes.
>
> Jordan asked why we saw this on phyp but not under KVM. We had a look at
> book3s_hv_rm_mmu.c but the code is a bit too obtuse for me to reason
> about!
>
> Nick suggests that the KVM hypervisor is invalidating the HPTE, but
> because we run guests in VPM mode, the hypervisor would catch the page
> fault and not reflect it down to the guest. It looks like Linux-as-a-HV
> will take HPTE_V_HVLOCK, and then because it's running in VPM mode, the
> hypervisor will catch the fault and not pass it to the guest.

Yep.

> But if phyp runs with VPM mode off, the guest will see the fault
> before the hypervisor. (we think this is what's going on anyway.)

Yeah. I assumed phyp always ran with VPM=1, but apparently it can run
with it off or on, depending on various configuration settings.

So I'm fairly sure what we're hitting here is VPM=0, where the faults go
straight to the guest.
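
To make the VPM distinction concrete, here is a toy model in Python. It is
only a sketch of the behaviour we suspect, not kernel or hypervisor code:
the names Hpte, fetch(), h_protect() and Outcome are invented for
illustration, and the "invalidate, then restore" sequence is what H_PROTECT
appears to do on phyp, not something confirmed.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    EXECUTED = "instruction executed normally"
    HV_RETRY = "hypervisor absorbed the fault; guest retries the fetch"
    GUEST_FAULT = "guest takes a 0x400 instruction access interrupt"


@dataclass
class Hpte:
    """Toy stand-in for one hash page table entry mapping kernel text."""
    valid: bool = True
    pp: str = "rwx"


def fetch(hpte: Hpte, vpm: bool) -> Outcome:
    """Model an instruction fetch by another CPU through `hpte`."""
    if hpte.valid:
        return Outcome.EXECUTED
    # The mapping is missing: who sees the fault depends on VPM.
    return Outcome.HV_RETRY if vpm else Outcome.GUEST_FAULT


def h_protect(hpte: Hpte, new_pp: str, vpm: bool) -> Outcome:
    """Model a permission update that briefly invalidates the entry.

    Returns what a concurrent fetch on another CPU would observe
    inside the invalidation window (assumed behaviour, see above).
    """
    hpte.valid = False        # mapping removed
    seen = fetch(hpte, vpm)   # another CPU fetches inside the window
    hpte.pp = new_pp          # updated permissions
    hpte.valid = True         # mapping restored
    return seen
```

With vpm=True the concurrent fetch never reaches the guest, which would
match what we see under KVM. With vpm=False the guest takes the fault
itself, as in the xmon trace above.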

cheers
