* [GIT PULL] x86/mm changes for v5.9 @ 2020-08-03 19:03 Ingo Molnar 2020-08-04 0:50 ` pr-tracker-bot 2020-08-05 11:03 ` Jason A. Donenfeld 0 siblings, 2 replies; 18+ messages in thread From: Ingo Molnar @ 2020-08-03 19:03 UTC (permalink / raw) To: Linus Torvalds Cc: linux-kernel, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton Linus, Please pull the latest x86/mm git tree from: git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-mm-2020-08-03 # HEAD: 2b32ab031e82a109e2c5b0d30ce563db0fe286b4 x86/mm/64: Make sync_global_pgds() static The biggest change is to not sync the vmalloc and ioremap ranges for x86-64 anymore. Thanks, Ingo ------------------> Joerg Roedel (3): x86/mm: Pre-allocate P4D/PUD pages for vmalloc area x86/mm/64: Do not sync vmalloc/ioremap mappings x86/mm/64: Make sync_global_pgds() static arch/x86/include/asm/pgtable_64.h | 2 -- arch/x86/include/asm/pgtable_64_types.h | 2 -- arch/x86/mm/init_64.c | 59 +++++++++++++++++++++++++++++---- 3 files changed, 53 insertions(+), 10 deletions(-) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-03 19:03 [GIT PULL] x86/mm changes for v5.9 Ingo Molnar @ 2020-08-04 0:50 ` pr-tracker-bot 2020-08-05 11:03 ` Jason A. Donenfeld 1 sibling, 0 replies; 18+ messages in thread From: pr-tracker-bot @ 2020-08-04 0:50 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, linux-kernel, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton The pull request you sent on Mon, 3 Aug 2020 21:03:54 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-mm-2020-08-03 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/e96ec8cf9ca12a8d6b3b896a2eccd4b92a1893ab Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-03 19:03 [GIT PULL] x86/mm changes for v5.9 Ingo Molnar 2020-08-04 0:50 ` pr-tracker-bot @ 2020-08-05 11:03 ` Jason A. Donenfeld 2020-08-05 17:05 ` Linus Torvalds 2020-08-06 22:56 ` Joerg Roedel 1 sibling, 2 replies; 18+ messages in thread From: Jason A. Donenfeld @ 2020-08-05 11:03 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, linux-kernel, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton, Joerg Roedel Hi Ingo, Joerg, On Mon, Aug 03, 2020 at 09:03:54PM +0200, Ingo Molnar wrote: > Linus, > > Please pull the latest x86/mm git tree from: > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-mm-2020-08-03 > > # HEAD: 2b32ab031e82a109e2c5b0d30ce563db0fe286b4 x86/mm/64: Make sync_global_pgds() static > > The biggest change is to not sync the vmalloc and ioremap ranges for x86-64 anymore. > > Thanks, > > Ingo > > ------------------> > Joerg Roedel (3): > x86/mm: Pre-allocate P4D/PUD pages for vmalloc area > x86/mm/64: Do not sync vmalloc/ioremap mappings > x86/mm/64: Make sync_global_pgds() static The commit 8bb9bf242d1f ("x86/mm/64: Do not sync vmalloc/ioremap mappings") causes the OOPS below, in Linus' tree and in linux-next, unearthed by my CI on <https://www.wireguard.com/build-status/>. Bisecting reveals 8bb9bf242d1f, and reverting this makes the OOPS go away. Jason BUG: unable to handle page fault for address: ffffe8ffffd00608 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 2 PID: 22 Comm: kworker/2:0 Not tainted 5.8.0+ #154 RIP: 0010:process_one_work+0x2c/0x2d0 Code: 41 56 41 55 41 54 55 48 89 f5 53 48 89 fb 48 83 ec 08 48 8b 06 4c 8b 67 40 49 89 c6 45 30 f6 a8 04 b8 00 00 00 00 4c 0f 44 f0 <49> 8b 46 08 44 8b a8 00 01 05 RSP: 0018:ffff88800016be98 EFLAGS: 00010002 RAX: 0000000000000000 RBX: ffff888000010c00 RCX: 0000000000000000 RDX: 00000000fffedb85 RSI: ffff888000391900 RDI: ffff888000010c00 RBP: ffff888000391900 R08: ffff888000391900 R09: ffff888000350090 R10: ffffffffff770df6 R11: ffff88800f91f840 R12: ffff88800f91f140 R13: ffff88800f91f160 R14: ffffe8ffffd00600 R15: ffff88800f91f140 FS: 0000000000000000(0000) GS:ffff88800f900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffe8ffffd00608 CR3: 00000000001be005 CR4: 00000000001706a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: worker_thread+0x4b/0x3b0 ? rescuer_thread+0x360/0x360 kthread+0x116/0x140 ? __kthread_create_worker+0x110/0x110 ret_from_fork+0x1f/0x30 CR2: ffffe8ffffd00608 ---[ end trace ca0cfc5a79414fb2 ]--- RIP: 0010:process_one_work+0x2c/0x2d0 Code: 41 56 41 55 41 54 55 48 89 f5 53 48 89 fb 48 83 ec 08 48 8b 06 4c 8b 67 40 49 89 c6 45 30 f6 a8 04 b8 00 00 00 00 4c 0f 44 f0 <49> 8b 46 08 44 8b a8 00 01 05 RSP: 0018:ffff88800016be98 EFLAGS: 00010002 RAX: 0000000000000000 RBX: ffff888000010c00 RCX: 0000000000000000 RDX: 00000000fffedb85 RSI: ffff888000391900 RDI: ffff888000010c00 RBP: ffff888000391900 R08: ffff888000391900 R09: ffff888000350090 R10: ffffffffff770df6 R11: ffff88800f91f840 R12: ffff88800f91f140 R13: ffff88800f91f160 R14: ffffe8ffffd00600 R15: ffff88800f91f140 FS: 0000000000000000(0000) GS:ffff88800f900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffe8ffffd00608 CR3: 00000000001be005 CR4: 00000000001706a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Kernel panic - not syncing: Fatal exception ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-05 11:03 ` Jason A. Donenfeld @ 2020-08-05 17:05 ` Linus Torvalds 2020-08-06 13:10 ` Ingo Molnar 2020-08-06 22:56 ` Joerg Roedel 1 sibling, 1 reply; 18+ messages in thread From: Linus Torvalds @ 2020-08-05 17:05 UTC (permalink / raw) To: Jason A. Donenfeld Cc: Ingo Molnar, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton, Joerg Roedel On Wed, Aug 5, 2020 at 4:03 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > The commit 8bb9bf242d1f ("x86/mm/64: Do not sync vmalloc/ioremap > mappings") causes the OOPS below, in Linus' tree and in linux-next, > unearthed by my CI on <https://www.wireguard.com/build-status/>. > Bisecting reveals 8bb9bf242d1f, and reverting this makes the OOPS go > away. The oops happens early in the function, and the "Code:" line actually gets almost the whole function prologue in it (missing first two bytes are probably "push %rbp"): 0: 41 56 push %r14 2: 41 55 push %r13 4: 41 54 push %r12 6: 55 push %rbp 7: 48 89 f5 mov %rsi,%rbp a: 53 push %rbx b: 48 89 fb mov %rdi,%rbx e: 48 83 ec 08 sub $0x8,%rsp 12: 48 8b 06 mov (%rsi),%rax 15: 4c 8b 67 40 mov 0x40(%rdi),%r12 19: 49 89 c6 mov %rax,%r14 1c: 45 30 f6 xor %r14b,%r14b 1f: a8 04 test $0x4,%al 21: b8 00 00 00 00 mov $0x0,%eax 26: 4c 0f 44 f0 cmove %rax,%r14 2a:* 49 8b 46 08 mov 0x8(%r14),%rax <-- trapping instruction > BUG: unable to handle page fault for address: ffffe8ffffd00608 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 0 P4D 0 Yeah, missing page table because it wasn't copied. Presumably because that kthread is using the active_mm of some random user space process that didn't get sync'ed. And the sync_global_pgds() may have ended up being sufficient synchronization with whoever allocated thigns, even if it wasn't about the TLB contents themselves. So apparently the "the page-table pages are all pre-allocated now" is simply not true. Joerg? Unless somebody can figure this out fairly quickly, I think it should just be reverted. Linus ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-05 17:05 ` Linus Torvalds @ 2020-08-06 13:10 ` Ingo Molnar 2020-08-06 18:57 ` Joerg Roedel 0 siblings, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2020-08-06 13:10 UTC (permalink / raw) To: Linus Torvalds Cc: Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton, Joerg Roedel * Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Aug 5, 2020 at 4:03 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > The commit 8bb9bf242d1f ("x86/mm/64: Do not sync vmalloc/ioremap > > mappings") causes the OOPS below, in Linus' tree and in linux-next, > > unearthed by my CI on <https://www.wireguard.com/build-status/>. > > Bisecting reveals 8bb9bf242d1f, and reverting this makes the OOPS go > > away. > > The oops happens early in the function, and the "Code:" line actually > gets almost the whole function prologue in it (missing first two bytes > are probably "push %rbp"): > > 0: 41 56 push %r14 > 2: 41 55 push %r13 > 4: 41 54 push %r12 > 6: 55 push %rbp > 7: 48 89 f5 mov %rsi,%rbp > a: 53 push %rbx > b: 48 89 fb mov %rdi,%rbx > e: 48 83 ec 08 sub $0x8,%rsp > 12: 48 8b 06 mov (%rsi),%rax > 15: 4c 8b 67 40 mov 0x40(%rdi),%r12 > 19: 49 89 c6 mov %rax,%r14 > 1c: 45 30 f6 xor %r14b,%r14b > 1f: a8 04 test $0x4,%al > 21: b8 00 00 00 00 mov $0x0,%eax > 26: 4c 0f 44 f0 cmove %rax,%r14 > 2a:* 49 8b 46 08 mov 0x8(%r14),%rax <-- trapping instruction > > > > BUG: unable to handle page fault for address: ffffe8ffffd00608 > > #PF: supervisor read access in kernel mode > > #PF: error_code(0x0000) - not-present page > > PGD 0 P4D 0 > > Yeah, missing page table because it wasn't copied. > > Presumably because that kthread is using the active_mm of some random > user space process that didn't get sync'ed. > > And the sync_global_pgds() may have ended up being sufficient > synchronization with whoever allocated thigns, even if it wasn't about > the TLB contents themselves. > > So apparently the "the page-table pages are all pre-allocated now" is > simply not true. Joerg? > > Unless somebody can figure this out fairly quickly, I think it should > just be reverted. Agreed. Joerg? Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 13:10 ` Ingo Molnar @ 2020-08-06 18:57 ` Joerg Roedel 2020-08-06 19:02 ` Linus Torvalds 0 siblings, 1 reply; 18+ messages in thread From: Joerg Roedel @ 2020-08-06 18:57 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Thu, Aug 06, 2020 at 03:10:34PM +0200, Ingo Molnar wrote: > > * Linus Torvalds <torvalds@linux-foundation.org> wrote: > > So apparently the "the page-table pages are all pre-allocated now" is > > simply not true. Joerg? It pre-allocates the whole vmalloc/ioremap PUD/P4D pages, but I actually only tested it with 4-level paging, as I don't have access to 5-level paging hardware. I'll try to make sense of that oops. > > Unless somebody can figure this out fairly quickly, I think it should > > just be reverted. > > Agreed. Joerg? Yeah, please just revert that commit for now until I have an idea what went wrong. Joerg ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 18:57 ` Joerg Roedel @ 2020-08-06 19:02 ` Linus Torvalds 2020-08-06 19:23 ` Joerg Roedel 2020-08-06 21:20 ` Ingo Molnar 0 siblings, 2 replies; 18+ messages in thread From: Linus Torvalds @ 2020-08-06 19:02 UTC (permalink / raw) To: Joerg Roedel Cc: Ingo Molnar, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Thu, Aug 6, 2020 at 11:57 AM Joerg Roedel <jroedel@suse.de> wrote: > > On Thu, Aug 06, 2020 at 03:10:34PM +0200, Ingo Molnar wrote: > > > > * Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > So apparently the "the page-table pages are all pre-allocated now" is > > > simply not true. Joerg? > > It pre-allocates the whole vmalloc/ioremap PUD/P4D pages, but I actually > only tested it with 4-level paging, as I don't have access to 5-level > paging hardware. I don't think Jason has either. The PGD 0 P4D 0 line tells us that "pgd_present()" is true, even though PGD is 0 (otherwise it wouldn't print the P4D part). That means that he doesn't have l5 enabled. But you may obviously have different settings for CONFIG_X86_5LEVEL, and maybe that ends up changing something? But since apparently it's not immediately obvious what the problem is, I'll revert it for now. Linus ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 19:02 ` Linus Torvalds @ 2020-08-06 19:23 ` Joerg Roedel 2020-08-06 19:42 ` Linus Torvalds 2020-08-07 9:53 ` Jason A. Donenfeld 2020-08-06 21:20 ` Ingo Molnar 1 sibling, 2 replies; 18+ messages in thread From: Joerg Roedel @ 2020-08-06 19:23 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Thu, Aug 06, 2020 at 12:02:40PM -0700, Linus Torvalds wrote: > But you may obviously have different settings for CONFIG_X86_5LEVEL, > and maybe that ends up changing something? > > But since apparently it's not immediately obvious what the problem is, > I'll revert it for now. Yes, that's the best for now. My gut feeling is that the fault Jason is seeing didn't happen on a vmalloc address, but I can't prove that yet. And if this is true it means that more work is needed before the syncing on x86-64 can be removed, so reverting is the best for now. Jason, can you share more details about the test setup which triggers this? Like the .config and the machine setup, ideally a qemu command-line, and how to reproduce it on that setup. Thanks, Joerg ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 19:23 ` Joerg Roedel @ 2020-08-06 19:42 ` Linus Torvalds 2020-08-06 19:57 ` Tejun Heo 2020-08-07 9:53 ` Jason A. Donenfeld 1 sibling, 1 reply; 18+ messages in thread From: Linus Torvalds @ 2020-08-06 19:42 UTC (permalink / raw) To: Joerg Roedel Cc: Ingo Molnar, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton, Tejun Heo On Thu, Aug 6, 2020 at 12:23 PM Joerg Roedel <jroedel@suse.de> wrote: > > Yes, that's the best for now. My gut feeling is that the fault Jason is > seeing didn't happen on a vmalloc address, but I can't prove that yet. No, it's definitely fairly high in the vmalloc space. Look at the faulting address: BUG: unable to handle page fault for address: ffffe8ffffd00608 and the code sequence is this: > 12: 48 8b 06 mov (%rsi),%rax > 15: 4c 8b 67 40 mov 0x40(%rdi),%r12 > 19: 49 89 c6 mov %rax,%r14 > 1c: 45 30 f6 xor %r14b,%r14b > 1f: a8 04 test $0x4,%al > 21: b8 00 00 00 00 mov $0x0,%eax > 26: 4c 0f 44 f0 cmove %rax,%r14 that admittedly odd sequence is get_work_pwq(work) And then the faulting instruction is: > 2a:* 49 8b 46 08 mov 0x8(%r14),%rax <-- trapping instruction and this is the "->wq" dereference. So it's the pwq->wq that traps, with 'pwq' being the trapping base pointer, and clearly being in the vmalloc space. I think pwq may a percpu allocation, so not _directly_ vmalloc(). Adding Tejun to the cc in case he can clarify ("No, silly Linus, it's allocated here.."). Linus ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 19:42 ` Linus Torvalds @ 2020-08-06 19:57 ` Tejun Heo 0 siblings, 0 replies; 18+ messages in thread From: Tejun Heo @ 2020-08-06 19:57 UTC (permalink / raw) To: Linus Torvalds Cc: Joerg Roedel, Ingo Molnar, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Thu, Aug 06, 2020 at 12:42:23PM -0700, Linus Torvalds wrote: > that admittedly odd sequence is get_work_pwq(work) > > And then the faulting instruction is: > > > 2a:* 49 8b 46 08 mov 0x8(%r14),%rax <-- trapping instruction > > and this is the "->wq" dereference. > > So it's the pwq->wq that traps, with 'pwq' being the trapping base > pointer, and clearly being in the vmalloc space. > > I think pwq may a percpu allocation, so not _directly_ vmalloc(). > Adding Tejun to the cc in case he can clarify ("No, silly Linus, it's > allocated here.."). Hey, silly Linus, yeap, they're per-cpu allocations and will be in vmalloc address space for per-cpu workqueues. For unbound workqueues, they're regular kmallocs. The per-cpu allocation happens in alloc_and_link_pwqs(): static int alloc_and_link_pwqs(struct workqueue_struct *wq) { bool highpri = wq->flags & WQ_HIGHPRI; int cpu, ret; if (!(wq->flags & WQ_UNBOUND)) { wq->cpu_pwqs = alloc_percpu(struct pool_workqueue); if (!wq->cpu_pwqs) return -ENOMEM; Thanks. -- tejun ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 19:23 ` Joerg Roedel 2020-08-06 19:42 ` Linus Torvalds @ 2020-08-07 9:53 ` Jason A. Donenfeld 1 sibling, 0 replies; 18+ messages in thread From: Jason A. Donenfeld @ 2020-08-07 9:53 UTC (permalink / raw) To: Joerg Roedel Cc: Linus Torvalds, Ingo Molnar, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton Hey Joerg, On Thu, Aug 6, 2020 at 9:23 PM Joerg Roedel <jroedel@suse.de> wrote: > Jason, can you share more details about the test setup which triggers > this? Like the .config and the machine setup, ideally a qemu > command-line, and how to reproduce it on that setup. make -C tools/testing/selftests/wireguard/qemu -j$(nproc) I can also confirm that the patch you sent to the list fixes the issue. Jason ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 19:02 ` Linus Torvalds 2020-08-06 19:23 ` Joerg Roedel @ 2020-08-06 21:20 ` Ingo Molnar 2020-08-07 8:47 ` Joerg Roedel 1 sibling, 1 reply; 18+ messages in thread From: Ingo Molnar @ 2020-08-06 21:20 UTC (permalink / raw) To: Linus Torvalds Cc: Joerg Roedel, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton * Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Aug 6, 2020 at 11:57 AM Joerg Roedel <jroedel@suse.de> wrote: > > > > On Thu, Aug 06, 2020 at 03:10:34PM +0200, Ingo Molnar wrote: > > > > > > * Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > > So apparently the "the page-table pages are all pre-allocated now" is > > > > simply not true. Joerg? > > > > It pre-allocates the whole vmalloc/ioremap PUD/P4D pages, but I actually > > only tested it with 4-level paging, as I don't have access to 5-level > > paging hardware. > > I don't think Jason has either. > > The > > PGD 0 P4D 0 > > line tells us that "pgd_present()" is true, even though PGD is 0 > (otherwise it wouldn't print the P4D part). That means that he doesn't > have l5 enabled. > > But you may obviously have different settings for CONFIG_X86_5LEVEL, > and maybe that ends up changing something? > > But since apparently it's not immediately obvious what the problem is, > I'll revert it for now. I've reverted it in x86/urgent as well earlier today, can send you that tree right now if you prefer that route. Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 21:20 ` Ingo Molnar @ 2020-08-07 8:47 ` Joerg Roedel 2020-08-13 19:30 ` Ingo Molnar 0 siblings, 1 reply; 18+ messages in thread From: Joerg Roedel @ 2020-08-07 8:47 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Thu, Aug 06, 2020 at 11:20:19PM +0200, Ingo Molnar wrote: > I've reverted it in x86/urgent as well earlier today, can send you > that tree right now if you prefer that route. I sent a fix for preallocate_vmalloc_pages() to correctly pre-allocate the vmalloc PGD entries. I verified that it works and that swapper_pg_dir contains the correct entries now. This should also fix the issue Jason is seeing. Sorry for screwing this up :-( Regards, Joerg ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-07 8:47 ` Joerg Roedel @ 2020-08-13 19:30 ` Ingo Molnar 2020-08-13 19:38 ` Linus Torvalds 2020-08-14 14:26 ` Joerg Roedel 0 siblings, 2 replies; 18+ messages in thread From: Ingo Molnar @ 2020-08-13 19:30 UTC (permalink / raw) To: Joerg Roedel Cc: Linus Torvalds, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton * Joerg Roedel <jroedel@suse.de> wrote: > On Thu, Aug 06, 2020 at 11:20:19PM +0200, Ingo Molnar wrote: > > I've reverted it in x86/urgent as well earlier today, can send you > > that tree right now if you prefer that route. > > I sent a fix for preallocate_vmalloc_pages() to correctly pre-allocate > the vmalloc PGD entries. I verified that it works and that > swapper_pg_dir contains the correct entries now. This should also fix > the issue Jason is seeing. Thanks! There's one thing left to do. Linus has reverted the patch which exposed this bug: 7b4ea9456dd3: ("Revert "x86/mm/64: Do not sync vmalloc/ioremap mappings"") and has applied your fix: 995909a4e22b: ("x86/mm/64: Do not dereference non-present PGD entries") I think now we can re-apply the original commit: 8bb9bf242d1f: ("x86/mm/64: Do not sync vmalloc/ioremap mappings") Mind re-sending it, with an updated changelog that explains why it's now truly safe? Would be tentatively scheduled for v5.10 though, we've had enough excitement in this area for v5.9 I think. :-/ > Sorry for screwing this up :-( No problem, and it was my fault too: I sent 8bb9bf242d1f to Linus too quickly, just 7 days after applying it - x86/mm patches usually need a few weeks of testing. Thanks, Ingo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-13 19:30 ` Ingo Molnar @ 2020-08-13 19:38 ` Linus Torvalds 2020-08-14 14:26 ` Joerg Roedel 1 sibling, 0 replies; 18+ messages in thread From: Linus Torvalds @ 2020-08-13 19:38 UTC (permalink / raw) To: Ingo Molnar Cc: Joerg Roedel, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Thu, Aug 13, 2020 at 12:30 PM Ingo Molnar <mingo@kernel.org> wrote: > > Would be tentatively scheduled for v5.10 though, we've had enough > excitement in this area for v5.9 I think. :-/ I think I can take it for 5.9 again if you send a pull request my way. If it causes any other problems, we'll obviously revert it again and at that point it's final for 5.9. But the one report it got was fixed early and quickly enough that I don't feel like things were _horribly_ broken, just a mistake that wasn't something horribly fundamental that needs another release to think about. Linus ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-13 19:30 ` Ingo Molnar 2020-08-13 19:38 ` Linus Torvalds @ 2020-08-14 14:26 ` Joerg Roedel 1 sibling, 0 replies; 18+ messages in thread From: Joerg Roedel @ 2020-08-14 14:26 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Jason A. Donenfeld, Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton Hi Ingo, On Thu, Aug 13, 2020 at 09:30:08PM +0200, Ingo Molnar wrote: > Mind re-sending it, with an updated changelog that explains why it's > now truly safe? I have updated the commit and will send it out later today, maybe together with a comment update in the preallocate_vmalloc_pages() function. Thanks, Joerg ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-05 11:03 ` Jason A. Donenfeld 2020-08-05 17:05 ` Linus Torvalds @ 2020-08-06 22:56 ` Joerg Roedel 2020-08-06 23:12 ` Joerg Roedel 1 sibling, 1 reply; 18+ messages in thread From: Joerg Roedel @ 2020-08-06 22:56 UTC (permalink / raw) To: Jason A. Donenfeld Cc: Ingo Molnar, Linus Torvalds, linux-kernel, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Wed, Aug 05, 2020 at 01:03:48PM +0200, Jason A. Donenfeld wrote: > BUG: unable to handle page fault for address: ffffe8ffffd00608 Okay, looks like my usage of the page-table macros is bogus, seems I don't understand their usage as good as I thought. The p?d_none checks in the allocation path are wrong and led to the bug. In fact it caused only the first PUD entry to be allocated and in the later iterations of the loop it always ended up on the same PUD entry. I still don't fully understand why, but its late here and my head spins. So I take another look tomorrow in the hope to understand it better. Please remind me to not take vacation again during merge windows :) Anyway... Jason, does the attached diff fix the issue in your testing? For me it makes all PUD pages correctly allocated. diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index c7a47603537f..e4abf73167d0 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -696,6 +696,7 @@ static void __init memory_map_bottom_up(unsigned long map_start, static void __init init_trampoline(void) { #ifdef CONFIG_X86_64 + if (!kaslr_memory_enabled()) trampoline_pgd_entry = init_top_pgt[pgd_index(__PAGE_OFFSET)]; else diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index e65b96f381a7..351fac590b02 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1248,27 +1248,23 @@ static void __init preallocate_vmalloc_pages(void) p4d_t *p4d; pud_t *pud; - p4d = p4d_offset(pgd, addr); - if (p4d_none(*p4d)) { - /* Can only happen with 5-level paging */ - p4d = p4d_alloc(&init_mm, pgd, addr); - if (!p4d) { - lvl = "p4d"; - goto failed; - } + p4d = p4d_alloc(&init_mm, pgd, addr); + if (!p4d) { + lvl = "p4d"; + goto failed; } if (pgtable_l5_enabled()) continue; - pud = pud_offset(p4d, addr); - if (pud_none(*pud)) { - /* Ends up here only with 4-level paging */ - pud = pud_alloc(&init_mm, p4d, addr); - if (!pud) { - lvl = "pud"; - goto failed; - } + /* + * With 4-level paging the P4D is folded, so allocate a + * PUD to have one level below PGD present. + */ + pud = pud_alloc(&init_mm, p4d, addr); + if (!pud) { + lvl = "pud"; + goto failed; } } ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [GIT PULL] x86/mm changes for v5.9 2020-08-06 22:56 ` Joerg Roedel @ 2020-08-06 23:12 ` Joerg Roedel 0 siblings, 0 replies; 18+ messages in thread From: Joerg Roedel @ 2020-08-06 23:12 UTC (permalink / raw) To: Jason A. Donenfeld Cc: Ingo Molnar, Linus Torvalds, linux-kernel, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton On Fri, Aug 07, 2020 at 12:56:51AM +0200, Joerg Roedel wrote: > I still don't fully understand why, but its late here and my head spins. > So I take another look tomorrow in the hope to understand it better. I think I know what the bug is, I am calling p4d_offset(pgd, addr) without checking for pgd_none() first. That is still a bug in the code wich needs fixing, even with the revert. Joerg ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2020-08-14 14:26 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-08-03 19:03 [GIT PULL] x86/mm changes for v5.9 Ingo Molnar 2020-08-04 0:50 ` pr-tracker-bot 2020-08-05 11:03 ` Jason A. Donenfeld 2020-08-05 17:05 ` Linus Torvalds 2020-08-06 13:10 ` Ingo Molnar 2020-08-06 18:57 ` Joerg Roedel 2020-08-06 19:02 ` Linus Torvalds 2020-08-06 19:23 ` Joerg Roedel 2020-08-06 19:42 ` Linus Torvalds 2020-08-06 19:57 ` Tejun Heo 2020-08-07 9:53 ` Jason A. Donenfeld 2020-08-06 21:20 ` Ingo Molnar 2020-08-07 8:47 ` Joerg Roedel 2020-08-13 19:30 ` Ingo Molnar 2020-08-13 19:38 ` Linus Torvalds 2020-08-14 14:26 ` Joerg Roedel 2020-08-06 22:56 ` Joerg Roedel 2020-08-06 23:12 ` Joerg Roedel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).