linux-kernel.vger.kernel.org archive mirror
From: Qian Cai <cai@lca.pw>
To: Christoph Hellwig <hch@lst.de>
Cc: Borislav Petkov <bp@suse.de>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	x86 <x86@kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	kasan-dev <kasan-dev@googlegroups.com>
Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()"
Date: Wed, 22 Apr 2020 14:35:26 -0400
Message-ID: <10D18276-0485-4368-BFDE-4EC13E42AE22@lca.pw>
In-Reply-To: <20200422170116.GA28345@lst.de>



> On Apr 22, 2020, at 1:01 PM, Christoph Hellwig <hch@lst.de> wrote:
> 
> On Wed, Apr 22, 2020 at 11:55:54AM -0400, Qian Cai wrote:
>> Reverted the linux-next commit and its dependency,
>> 
>> a85573f7e741 ("x86/mm: Unexport __cachemode2pte_tbl")
>> 9e294786c89a ("x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()")
>> 
>> fixed crashes or hard reset on AMD machines during boot that have been flagged by
>> KASAN in different forms indicating some sort of memory corruption with this config,
> 
> Interesting.  Your config seems to boot fine in my VM until the point
> where the lack of virtio-blk support stops it from mounting the root
> file system.
> 
> Looking at the patch I found one bug, although that should not affect
> your config (it should use the pgprotval_t type), and one difference
> that could affect code generation, although I prefer the new version
> (use of __pgprot vs a local variable + pgprot_val()).
> 
> Two patches attached, can you try them?
> <0001-x86-Use-pgprotval_t-in-protval_4k_2_large-and-pgprot.patch><0002-foo.patch>
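
To illustrate the first point in the quoted message (a helper taking a type narrower than pgprotval_t), here is a user-space sketch. The bit positions (PAT at bit 7, PAT for large pages at bit 12, NX at bit 63) follow the x86 page-table layout; the type and function names are hypothetical stand-ins, not the kernel's code. It assumes the narrow type is 32 bits wide, as unsigned long would be on a 32-bit PAE build. On x86-64 both types are 64 bits wide, which would be consistent with the remark that this config should not be affected.

```c
#include <stdint.h>

/* x86 page-table bit positions, reproduced for a user-space sketch. */
#define PAGE_BIT_PAT        7
#define PAGE_BIT_PAT_LARGE  12
#define PAGE_BIT_NX         63

#define PAGE_PAT        (UINT64_C(1) << PAGE_BIT_PAT)
#define PAGE_PAT_LARGE  (UINT64_C(1) << PAGE_BIT_PAT_LARGE)
#define PAGE_NX         (UINT64_C(1) << PAGE_BIT_NX)

typedef uint64_t protval64_t;  /* hypothetical stand-in for a 64-bit pgprotval_t */
typedef uint32_t narrow_t;     /* hypothetical stand-in for a 32-bit unsigned long */

/* Move the PAT bit from its 4k position (bit 7) to the large-page position (bit 12). */
static protval64_t protval_4k_2_large_wide(protval64_t val)
{
    return (val & ~(PAGE_PAT | PAGE_PAT_LARGE)) |
           ((val & PAGE_PAT) << (PAGE_BIT_PAT_LARGE - PAGE_BIT_PAT));
}

/*
 * Same conversion, but the parameter type is too narrow: the caller's
 * value is truncated to 32 bits before the function body runs, so any
 * bit above bit 31 (for example NX) is silently dropped.
 */
static protval64_t protval_4k_2_large_narrow(narrow_t val)
{
    return protval_4k_2_large_wide(val);
}
```

Passing `PAGE_NX | PAGE_PAT | 0x63` to both helpers gives the same low bits, but only the wide variant keeps bit 63 set.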

Yes, but neither patch helps here. This time it is flagged by UBSAN:

static void dump_pagetable(unsigned long address)
{
        pgd_t *base = __va(read_cr3_pa());
        pgd_t *pgd = base + pgd_index(address); /* <-- shift-out-of-bounds here */

[    4.452663][    T0] ACPI: LAPIC_NMI (acpi_id[0x73] high level lint[0x1])
[    4.459391][    T0] ACPI: LAPIC_NMI (acpi_id[0x74] high level lint[0x1])
[    4.466115][    T0] ACPI: LAPIC_NMI (acpi_id[0x75] high level lint[0x1])
[    4.472842][    T0] ACPI: LAPIC_NMI (acpi_id[0x76] high level lint[0x1])
[    4.479567][    T0] ACPI: LAPIC_NMI (acpi_id[0x77] high level lint[0x1])
[    4.486294][    T0] ACPI: LAPIC_NMI (acpi_id[0x78] high level lint[0x1])
[    4.493021][    T0] ACPI: LAPIC_NMI (acpi_id[0x79] high level lint[0x1])
[    4.499745][    T0] ACPI: LAPIC_NMI (acpi_id[0x7a] high level lint[0x1])
[    4.506471][    T0] ACPI: LAPIC_NMI (acpi_id[0x7b] high level li[... console output truncated ...]ad access in kernel mode
[    4.901030][    T0] #PF: error_code(0x0000) - not-present page
[    4.906884][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.914483][    T0] #PF: supervisor read access in kernel mode
[    4.920334][    T0] #PF: error_code(0x0000) - not-present page
[    4.926189][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.933786][    T0] #PF: supervisor read access in kernel mode
[    4.939640][    T0] #PF: error_code(0x0000) - not-present page
[    4.945492][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.953091][    T0] #PF: supervisor read access in kernel mode
[    4.958943][    T0] #PF: error_code(0x0000) - not-present page
[    4.964797][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.972395][    T0] #PF: supervisor read access in kernel mode
[    4.978247][    T0] #PF: error_code(0x0000) - not-present page
[    4.984102][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.9917[... console output truncated ...]age fault for address: ffffed11509c29da
[    5.481007][    T0] #PF: supervisor read access in kernel mode
[    5.486862][    T0] #PF: error_code(0x0000) - not-present page
[    5.492713][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    5.500314][    T0] #PF: supervisor read access in kernel mode
[    5.506165][    T0] #PF: error_code(0x0000) - not-present page
[    5.512020][    T0] ================================================================================
[    5.521193][    T0] UBSAN: shift-out-of-bounds in arch/x86/mm/fault.c:450:22
[    5.528268][    T0] shift exponent 4294967295 is too large for 64-bit type 'long unsigned int'
[    5.536916][    T0] CPU: 0 PID: 0 Comm: swapper Tainted: G    B             5.7.0-rc2-next-20200422+ #10
[    5.546434][    T0] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
[    5.555692][    T0] Call Trace:
[    5.558837][    T0] ================================================================================
[    5.568012][T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    5.961699][    T0] #PF: supervisor read access in kernel mode
[    5.967550][    T0] #PF: error_code(0x0000) - not-present page
[    5.973405][    T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    5.981005][    T0] #PF: supervisor read access in kernel mode
[    5.986856][    T0] #PF: error_code(0x0000) - not-present page
[    5.992708][    T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    6.000308][    T0] #PF: supervisor read access in kernel mode
[    6.006159][    T0] #PF: error_code(0x0000) - not-present page
[    6.012013][    T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    6.019612][    T0] #PF: supervisor read access in kernel mode
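
For reference, the shift exponent in the UBSAN line above, 4294967295, is what -1 becomes when converted to a 32-bit unsigned int. The sketch below shows that conversion and a pgd_index()-style computation in user space. It assumes the index is formed as (address >> shift) & (entries - 1), with 512 entries per PGD and the shift supplied at run time; the function names and the explicit range check are hypothetical and only illustrate why such an exponent is out of bounds for a 64-bit operand. It makes no claim about how the shift value ended up wrong on this machine.

```c
#include <limits.h>
#include <stdint.h>

#define SKETCH_PTRS_PER_PGD 512u  /* assumed number of PGD entries */

/* Converting -1 to unsigned int yields UINT_MAX (4294967295 for 32-bit int). */
static unsigned int exponent_from_int(int v)
{
    return (unsigned int)v;
}

/* A shift is only defined when the exponent is below the operand width. */
static int shift_is_valid(unsigned int shift)
{
    return shift < sizeof(uint64_t) * CHAR_BIT;
}

/*
 * pgd_index()-style lookup with a run-time shift. Returns 0 and leaves
 * *index untouched if the exponent is out of range, instead of executing
 * the undefined shift that UBSAN would report.
 */
static int pgd_index_sketch(uint64_t address, unsigned int shift, uint64_t *index)
{
    if (!shift_is_valid(shift))
        return 0;
    *index = (address >> shift) & (SKETCH_PTRS_PER_PGD - 1);
    return 1;
}
```

With a shift of 39 (the 4-level paging value), the faulting address ffffed11509c29da from the log maps to a valid index, while an exponent of 4294967295 is rejected by the range check.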


