linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* general protection fault with prefetch_freepointer
@ 2018-07-18 14:31 Paul Menzel
  2018-08-08 11:54 ` Vlastimil Babka
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Menzel @ 2018-07-18 14:31 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4304 bytes --]

Dear Linux,


Loading the amdgpu module on Ryzen 3 2{2,4}00G (Raven) systems sometimes
causes a general protection fault [1]. At least on my system I am unable
to reliably reproduce the issue.

```
[   35.265941] kfd kfd: kgd2kfd_probe failed
[   35.537445] general protection fault: 0000 [#1] SMP NOPTI
[   35.543209] CPU: 0 PID: 367 Comm: systemd-udevd Not tainted 4.18.0-rc5+ #1
[   35.550371] Hardware name: MSI MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.G1 05/17/2018
[   35.558562] RIP: 0010:prefetch_freepointer+0x10/0x20
[   35.563881] Code: 89 d3 e8 c3 fe 4a 00 85 c0 0f 85 31 75 00 00 48 83 c4 08 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e c3 66 90 0f 1f 44 00 00 48 
[   35.584215] RSP: 0018:ffff9c3181f77560 EFLAGS: 00010202
[   35.589849] RAX: 0000000000000000 RBX: 4b2c8be0a60f6ab9 RCX: 0000000000000ccc
[   35.597492] RDX: 0000000000000ccb RSI: 4b2c8be0a60f6ab9 RDI: ffff8d5b1e406e80
[   35.605166] RBP: ffff8d5b1e406e80 R08: ffff8d5b1e824f00 R09: ffffffffc0a45423
[   35.612808] R10: fffff1b60fff64c0 R11: ffff9c3181f77520 R12: 00000000006080c0
[   35.620451] R13: 0000000000000230 R14: ffff8d5b1e406e80 R15: ffff8d5b0c304400
[   35.628116] FS:  00007fb194e4b8c0(0000) GS:ffff8d5b1e800000(0000) knlGS:0000000000000000
[   35.636771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   35.642931] CR2: 000055c0661d60d0 CR3: 000000040b9ee000 CR4: 00000000003406f0
[   35.650566] Call Trace:
[   35.653198]  kmem_cache_alloc_trace+0xb5/0x1c0
[   35.658077]  ? dal_ddc_service_create+0x38/0x110 [amdgpu]
[   35.663962]  dal_ddc_service_create+0x38/0x110 [amdgpu]
```

Do you have more ideas, how to debug that? The progress in bug report
kind of stalled as people have run out of ideas.

I thought about a hardware/firmware problem, but then searching for the
error, I found the stack trace with Linux 4.17.2 from Fedora [2][3].
This looks a little similar, but can of course be something totally
different.

```
general protection fault: 0000 [#1] SMP PTI
Modules linked in: sctp netlink_diag nfsv3 nfnetlink_queue nfnetlink_log nfnetlink cfg80211 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rfcomm fuse hidp xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_conntrack tun devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc bnep it87 hwmon_vid btrfs xor zstd_compress raid6_pq libcrc32c zstd_decompress xxhash intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm gpio_ich iTCO_wdt iTCO_vendor_support ppdev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf btusb btrtl btbcm btintel bluetooth snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel
 snd_hda_codec ecdh_generic rfkill snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer i2c_i801 lpc_ich mei_me mei shpchp snd soundcore parport_pc parport nfsd binfmt_misc nfs_acl lockd auth_rpcgss grace sunrpc i915 crc32c_intel i2c_algo_bit drm_kms_helper serio_raw drm r8169 sata_via mii video
CPU: 0 PID: 2690 Comm: nfsd Not tainted 4.17.2-200.fc28.x86_64 #1
Hardware name: Gigabyte Technology Co., Ltd. H81M-S2PV/H81M-S2PV, BIOS F8 06/19/2014
RIP: 0010:prefetch_freepointer+0x10/0x20
RSP: 0018:ffffa33fca4ebc58 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 49f929d076783c7f RCX: 0000000000000f2a
RDX: 0000000000000f29 RSI: 49f929d076783c7f RDI: ffff8da88c9ffe00
RBP: ffff8da88c9ffe00 R08: ffff8da89ee2b100 R09: 0000000000000004
R10: 0000000000000000 R11: 000000000000002c R12: 00000000014080c0
R13: ffffffffc07e0ac1 R14: ffff8da4bc958ae1 R15: ffff8da88c9ffe00
FS:  0000000000000000(0000) GS:ffff8da89ee00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002001a1783000 CR3: 000000012920a001 CR4: 00000000000606f0
Call Trace:
 kmem_cache_alloc+0xb4/0x1d0
 ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
```

Help on how to debug this is appreciated.


Kind regards,

Paul


[1]: https://bugs.freedesktop.org/show_bug.cgi?id=105684
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1600482
[3]: https://retrace.fedoraproject.org/faf/reports/2243945/


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: general protection fault with prefetch_freepointer
  2018-07-18 14:31 general protection fault with prefetch_freepointer Paul Menzel
@ 2018-08-08 11:54 ` Vlastimil Babka
  2018-08-08 13:16   ` Vlastimil Babka
  0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2018-08-08 11:54 UTC (permalink / raw)
  To: Paul Menzel, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-mm, linux-kernel, Kees Cook,
	Alex Deucher

On 07/18/2018 04:31 PM, Paul Menzel wrote:
> Dear Linux,
> 
> 
> Loading the amdgpu module on Ryzen 3 2{2,4}00G (Raven) systems sometimes
> causes a general protection fault [1]. At least on my system I am unable
> to reliably reproduce the issue.
> 
> ```
> [   35.265941] kfd kfd: kgd2kfd_probe failed
> [   35.537445] general protection fault: 0000 [#1] SMP NOPTI
> [   35.543209] CPU: 0 PID: 367 Comm: systemd-udevd Not tainted 4.18.0-rc5+ #1
> [   35.550371] Hardware name: MSI MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.G1 05/17/2018
> [   35.558562] RIP: 0010:prefetch_freepointer+0x10/0x20
> [   35.563881] Code: 89 d3 e8 c3 fe 4a 00 85 c0 0f 85 31 75 00 00 48 83 c4 08 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 48 85 f6 74 13 8b 47 20 48 01 c6 <48> 33 36 48 33 b7 38 01 00 00 0f 18 0e c3 66 90 0f 1f 44 00 00 48 
> [   35.584215] RSP: 0018:ffff9c3181f77560 EFLAGS: 00010202
> [   35.589849] RAX: 0000000000000000 RBX: 4b2c8be0a60f6ab9 RCX: 0000000000000ccc
> [   35.597492] RDX: 0000000000000ccb RSI: 4b2c8be0a60f6ab9 RDI: ffff8d5b1e406e80
> [   35.605166] RBP: ffff8d5b1e406e80 R08: ffff8d5b1e824f00 R09: ffffffffc0a45423
> [   35.612808] R10: fffff1b60fff64c0 R11: ffff9c3181f77520 R12: 00000000006080c0
> [   35.620451] R13: 0000000000000230 R14: ffff8d5b1e406e80 R15: ffff8d5b0c304400
> [   35.628116] FS:  00007fb194e4b8c0(0000) GS:ffff8d5b1e800000(0000) knlGS:0000000000000000
> [   35.636771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   35.642931] CR2: 000055c0661d60d0 CR3: 000000040b9ee000 CR4: 00000000003406f0
> [   35.650566] Call Trace:
> [   35.653198]  kmem_cache_alloc_trace+0xb5/0x1c0
> [   35.658077]  ? dal_ddc_service_create+0x38/0x110 [amdgpu]
> [   35.663962]  dal_ddc_service_create+0x38/0x110 [amdgpu]
> ```

./scripts/decodecode:

All code
========
   0:   89 d3                   mov    %edx,%ebx
   2:   e8 c3 fe 4a 00          callq  0x4afeca
   7:   85 c0                   test   %eax,%eax
   9:   0f 85 31 75 00 00       jne    0x7540
   f:   48 83 c4 08             add    $0x8,%rsp
  13:   5b                      pop    %rbx
  14:   5d                      pop    %rbp
  15:   41 5c                   pop    %r12
  17:   41 5d                   pop    %r13
  19:   c3                      retq   
  1a:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  1f:   48 85 f6                test   %rsi,%rsi
  22:   74 13                   je     0x37
  24:   8b 47 20                mov    0x20(%rdi),%eax
  27:   48 01 c6                add    %rax,%rsi
  2a:*  48 33 36                xor    (%rsi),%rsi <- HERE
  2d:   48 33 b7 38 01 00 00    xor    0x138(%rdi),%rsi
  34:   0f 18 0e                prefetcht0 (%rsi)
  37:   c3                      retq   
  38:   66 90                   xchg   %ax,%ax
  3a:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  3f:   48                      rex.W

rsi contains 4b2c8be0a60f6ab9, so reading from that clearly bogus
address faults. Dunno though why CR2 is 000055c0661d60d0 as that
should contain the same value. The XORs suggests you have
CONFIG_SLAB_FREELIST_HARDENED enabled, I wonder does the bug
reproduce with that disabled?

mov    0x20(%rdi),%eax is reading s->offset, so RDI=ffff8d5b1e406e80
is kmem_cache pointer, RAX=0 means offset was 0. This means
RSI=4b2c8be0a60f6ab9 is "object" parameter of prefetch_freepointer(),
i.e. the "next_object" variable in slab_alloc_node(), obtained from
the previous object. But I don't know much about SLUB internals to be
honest.

> Do you have more ideas, how to debug that? The progress in bug report
> kind of stalled as people have run out of ideas.
> 
> I thought about a hardware/firmware problem, but then searching for the
> error, I found the stack trace with Linux 4.17.2 from Fedora [2][3].
> This looks a little similar, but can of course be something totally
> different.
 
Hmm I have looked at the splats in all the bugs you referenced and the
Code part always has the de-obfuscation XORs. Then in comment 36 of
[1] jian-hong says the problem disappeared, and in comment 40 posts
a config that has CONFIG_SLAB_FREELIST_HARDENED disabled. Earlier
posting of his config has it enabled and confirms the disassembly.
Very suspicious, huh.

> general protection fault: 0000 [#1] SMP PTI
> Modules linked in: sctp netlink_diag nfsv3 nfnetlink_queue nfnetlink_log nfnetlink cfg80211 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rfcomm fuse hidp xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_conntrack tun devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc bnep it87 hwmon_vid btrfs xor zstd_compress raid6_pq libcrc32c zstd_decompress xxhash intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm gpio_ich iTCO_wdt iTCO_vendor_support ppdev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf btusb btrtl btbcm btintel bluetooth snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel
>  snd_hda_codec ecdh_generic rfkill snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer i2c_i801 lpc_ich mei_me mei shpchp snd soundcore parport_pc parport nfsd binfmt_misc nfs_acl lockd auth_rpcgss grace sunrpc i915 crc32c_intel i2c_algo_bit drm_kms_helper serio_raw drm r8169 sata_via mii video
> CPU: 0 PID: 2690 Comm: nfsd Not tainted 4.17.2-200.fc28.x86_64 #1
> Hardware name: Gigabyte Technology Co., Ltd. H81M-S2PV/H81M-S2PV, BIOS F8 06/19/2014
> RIP: 0010:prefetch_freepointer+0x10/0x20
> RSP: 0018:ffffa33fca4ebc58 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: 49f929d076783c7f RCX: 0000000000000f2a
> RDX: 0000000000000f29 RSI: 49f929d076783c7f RDI: ffff8da88c9ffe00
> RBP: ffff8da88c9ffe00 R08: ffff8da89ee2b100 R09: 0000000000000004
> R10: 0000000000000000 R11: 000000000000002c R12: 00000000014080c0
> R13: ffffffffc07e0ac1 R14: ffff8da4bc958ae1 R15: ffff8da88c9ffe00
> FS:  0000000000000000(0000) GS:ffff8da89ee00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00002001a1783000 CR3: 000000012920a001 CR4: 00000000000606f0
> Call Trace:
>  kmem_cache_alloc+0xb4/0x1d0
>  ? nfsd4_free_file_rcu+0x20/0x20 [nfsd]
> ```
> 
> Help on how to debug this is appreciated.
> 
> 
> Kind regards,
> 
> Paul
> 
> 
> [1]: https://bugs.freedesktop.org/show_bug.cgi?id=105684
> [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1600482
> [3]: https://retrace.fedoraproject.org/faf/reports/2243945/
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: general protection fault with prefetch_freepointer
  2018-08-08 11:54 ` Vlastimil Babka
@ 2018-08-08 13:16   ` Vlastimil Babka
  2018-08-09  8:52     ` [PATCH] mm, slub: restore the original intention of prefetch_freepointer() Vlastimil Babka
  0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2018-08-08 13:16 UTC (permalink / raw)
  To: Paul Menzel, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-mm, linux-kernel, Kees Cook,
	Alex Deucher

On 08/08/2018 01:54 PM, Vlastimil Babka wrote:
> Hmm I have looked at the splats in all the bugs you referenced and the
> Code part always has the de-obfuscation XORs. Then in comment 36 of
> [1] jian-hong says the problem disappeared, and in comment 40 posts
> a config that has CONFIG_SLAB_FREELIST_HARDENED disabled. Earlier
> posting of his config has it enabled and confirms the disassembly.
> Very suspicious, huh.

So I'm looking at 2482ddec670f ("mm: add SLUB free list pointer
obfuscation") from Kees, and one suspicious thing is:

before, prefetch_freepointer() was just:

	prefetch(object + s->offset);

after, it is

	if (object)
		prefetch(freelist_dereference(s, object + s->offset));

Where freelist_dereference() is either a simple dereference of address,
when FREELIST_HARDENED is disabled, or adds those XORs when enabled.
However, this had changed the prefetch intention! Previously it just
prefetched the address (object + s->offset), now it *dereferences it*,
optionally changes the value read with those XORs, and then prefetches
the result.

This unintentionally adds a non-prefetching read from (object +
s->offset), which may fault, and wasn't there before. It's safe from
NULL pointers, but not from bogus pointers, and faults that
get_freepointer_safe() prevents. Note that alone doesn't explain why
disabling SLAB_FREELIST_HARDENED would help, as the dereference is there
unconditionally. But IMHO it's a bug in the commit and minimally it
likely has some performance impact.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] mm, slub: restore the original intention of prefetch_freepointer()
  2018-08-08 13:16   ` Vlastimil Babka
@ 2018-08-09  8:52     ` Vlastimil Babka
  2018-08-14  0:03       ` Kees Cook
  0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2018-08-09  8:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Paul Menzel, Alex Deucher,
	Vlastimil Babka, Kees Cook, Daniel Micay, Eric Dumazet,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim

In SLUB, prefetch_freepointer() is used when allocating an object from cache's
freelist, to make sure the next object in the list is cache-hot, since it's
probable it will be allocated soon.

Commit 2482ddec670f ("mm: add SLUB free list pointer obfuscation") has
unintentionally changed the prefetch in a way where the prefetch is turned to a
real fetch, and only the next->next pointer is prefetched. In case there is not
a stream of allocations that would benefit from prefetching, the extra real
fetch might add a useless cache miss to the allocation. Restore the previous
behavior.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
While I don't expect this to be causing the bug at hand, it's worth fixing.
For the bug it might mean that the page fault moves elsewhere.

 mm/slub.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 51258eff4178..ce2b9e5cea77 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -271,8 +271,7 @@ static inline void *get_freepointer(struct kmem_cache *s, void *object)
 
 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
 {
-	if (object)
-		prefetch(freelist_dereference(s, object + s->offset));
+	prefetch(object + s->offset);
 }
 
 static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm, slub: restore the original intention of prefetch_freepointer()
  2018-08-09  8:52     ` [PATCH] mm, slub: restore the original intention of prefetch_freepointer() Vlastimil Babka
@ 2018-08-14  0:03       ` Kees Cook
  0 siblings, 0 replies; 5+ messages in thread
From: Kees Cook @ 2018-08-14  0:03 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Linux-MM, LKML, Paul Menzel, Alex Deucher,
	Daniel Micay, Eric Dumazet, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim

On Thu, Aug 9, 2018 at 1:52 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> In SLUB, prefetch_freepointer() is used when allocating an object from cache's
> freelist, to make sure the next object in the list is cache-hot, since it's
> probable it will be allocated soon.
>
> Commit 2482ddec670f ("mm: add SLUB free list pointer obfuscation") has
> unintentionally changed the prefetch in a way where the prefetch is turned to a
> real fetch, and only the next->next pointer is prefetched. In case there is not
> a stream of allocations that would benefit from prefetching, the extra real
> fetch might add a useless cache miss to the allocation. Restore the previous
> behavior.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Daniel Micay <danielmicay@gmail.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
> While I don't expect this to be causing the bug at hand, it's worth fixing.
> For the bug it might mean that the page fault moves elsewhere.
>
>  mm/slub.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 51258eff4178..ce2b9e5cea77 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -271,8 +271,7 @@ static inline void *get_freepointer(struct kmem_cache *s, void *object)
>
>  static void prefetch_freepointer(const struct kmem_cache *s, void *object)
>  {
> -       if (object)
> -               prefetch(freelist_dereference(s, object + s->offset));
> +       prefetch(object + s->offset);

Ah -- gotcha. I think I misunderstood the purpose here. You're not
prefetching what is being pointed at, you're literally prefetching
what is stored there. That wouldn't require dereferencing the freelist
pointer, no.

Thanks!

Acked-by: Kees Cook <keescook@chromium.org>

>  }
>
>  static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
> --
> 2.18.0
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-08-14  0:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-18 14:31 general protection fault with prefetch_freepointer Paul Menzel
2018-08-08 11:54 ` Vlastimil Babka
2018-08-08 13:16   ` Vlastimil Babka
2018-08-09  8:52     ` [PATCH] mm, slub: restore the original intention of prefetch_freepointer() Vlastimil Babka
2018-08-14  0:03       ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).