From: Juergen Gross <jgross@suse.com>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: kernel NULL pointer dereference in gntdev_mmap -> mmu_interval_notifier_remove
Date: Mon, 19 Apr 2021 11:33:27 +0200
Message-ID: <68f4d2e3-4b25-58a6-d690-d5854c509354@suse.com>
In-Reply-To: <YHxFtj3dyjFbeusP@mail-itl>


[-- Attachment #1.1.1: Type: text/plain, Size: 6489 bytes --]

On 18.04.21 16:44, Marek Marczykowski-Górecki wrote:
> Hi,
> 
> I recently got a crash like the one below. I'm not sure what exactly
> triggers it (besides grant table mapping, as seen in the call trace), and
> I don't have a reliable reproducer either. It happened once in about 30
> startups.
> 
> The previous version tested was 5.10.25 and it didn't happen there, but
> since the reproduction rate is low, that could be just luck...
> 
> [ 1053.550389] BUG: kernel NULL pointer dereference, address: 00000000000003b0
> [ 1053.557844] #PF: supervisor read access in kernel mode
> [ 1053.557847] #PF: error_code(0x0000) - not-present page
> [ 1053.557851] PGD 0 P4D 0
> [ 1053.557858] Oops: 0000 [#1] SMP NOPTI
> [ 1053.557863] CPU: 1 PID: 8806 Comm: Xorg Tainted: G        W         5.10.28-1.fc32.qubes.x86_64 #1
> [ 1053.557865] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> [ 1053.557876] RIP: e030:mmu_interval_notifier_remove+0x2e/0x190
> [ 1053.557879] Code: 00 41 55 41 54 55 48 89 fd 53 48 83 ec 30 4c 8b 67 38 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 48 c7 04 24 00 00 00 00 <49> 8b 9c 24 b0 03 00 00 48 c7 44 24 08 00 00 00 00 48 c7 44 24 10
> [ 1053.557881] RSP: e02b:ffffc90041617d18 EFLAGS: 00010246
> [ 1053.557883] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 1053.557884] RDX: 0000000000000001 RSI: ffffffff81c3e9a0 RDI: ffff88812588b700
> [ 1053.557885] RBP: ffff88812588b700 R08: 7fffffffffffffff R09: 0000000000000000
> [ 1053.557886] R10: ffff8881088d4708 R11: ffff888108aa6180 R12: 0000000000000000
> [ 1053.557887] R13: 00000000fffffffc R14: ffff888106a3ec00 R15: ffff888106a3ec10
> [ 1053.557913] FS:  0000716f7f9a3a40(0000) GS:ffff888140300000(0000) knlGS:0000000000000000
> [ 1053.557915] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1053.557916] CR2: 00000000000003b0 CR3: 0000000105cf4000 CR4: 0000000000000660
> [ 1053.557919] Call Trace:
> [ 1053.557944]  gntdev_mmap+0x275/0x2f9 [xen_gntdev]
> [ 1053.557950]  mmap_region+0x47e/0x720
> [ 1053.557953]  do_mmap+0x438/0x540
> [ 1053.557959]  ? security_mmap_file+0x81/0xd0
> [ 1053.557963]  vm_mmap_pgoff+0xdf/0x130
> [ 1053.557967]  ksys_mmap_pgoff+0x1d6/0x240
> [ 1053.557973]  do_syscall_64+0x33/0x40
> [ 1053.557977]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 1053.557981] RIP: 0033:0x716f7fe8c2e6
> [ 1053.557985] Code: 01 00 66 90 f3 0f 1e fa 41 f7 c1 ff 0f 00 00 75 2b 55 48 89 fd 53 89 cb 48 85 ff 74 37 41 89 da 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 62 5b 5d c3 0f 1f 80 00 00 00 00 48 8b 05 79
> [ 1053.557986] RSP: 002b:00007ffcb4ef35c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
> [ 1053.557988] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000716f7fe8c2e6
> [ 1053.557989] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
> [ 1053.557990] RBP: 0000000000000000 R08: 0000000000000009 R09: 0000000000000000
> [ 1053.557991] R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffcb4ef35e0
> [ 1053.557992] R13: 0000000000000001 R14: 0000000000000009 R15: 0000000000000001
> [ 1053.557995] Modules linked in: loop nf_tables nfnetlink vfat fat xfs snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation ppdev snd_soc_core snd_compress snd_pcm_dmaengine soundwire_cadence joydev snd_hda_codec snd_hda_core ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm edac_mce_amd snd_timer pcspkr snd soundcore e1000e i2c_piix4 parport_pc parport xenfs fuse ip_tables dm_crypt bochs_drm drm_vram_helper drm_kms_helper cec drm_ttm_helper ttm serio_raw drm virtio_scsi virtio_console ehci_pci ehci_hcd ata_generic pata_acpi floppy qemu_fw_cfg xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput
> [ 1053.558040] CR2: 00000000000003b0
> [ 1053.558135] ---[ end trace 3c5c2ca63aac717a ]---
> [ 1054.277085] snd_hda_intel 0000:00:03.0: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
> [ 1054.927022] RIP: e030:mmu_interval_notifier_remove+0x2e/0x190
> [ 1054.929170] Code: 00 41 55 41 54 55 48 89 fd 53 48 83 ec 30 4c 8b 67 38 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 48 c7 04 24 00 00 00 00 <49> 8b 9c 24 b0 03 00 00 48 c7 44 24 08 00 00 00 00 48 c7 44 24 10
> [ 1054.937800] RSP: e02b:ffffc90041617d18 EFLAGS: 00010246
> [ 1054.947281] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 1054.949535] RDX: 0000000000000001 RSI: ffffffff81c3e9a0 RDI: ffff88812588b700
> [ 1054.973016] RBP: ffff88812588b700 R08: 7fffffffffffffff R09: 0000000000000000
> [ 1054.976678] R10: ffff8881088d4708 R11: ffff888108aa6180 R12: 0000000000000000
> [ 1054.978850] R13: 00000000fffffffc R14: ffff888106a3ec00 R15: ffff888106a3ec10
> [ 1054.980751] FS:  0000716f7f9a3a40(0000) GS:ffff888140300000(0000) knlGS:0000000000000000
> [ 1054.982878] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1054.984509] CR2: 00000000000003b0 CR3: 0000000105cf4000 CR4: 0000000000000660
> [ 1054.990508] Kernel panic - not syncing: Fatal exception
> [ 1054.991967] Kernel Offset: disabled
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
> 
> Looking at the surrounding code, the faulting access is to 0x3b0(%r12),
> and %r12 was loaded from 0x38(%rdi):
> 
> ffffffff812f5930 <mmu_interval_notifier_remove>:
> ffffffff812f5930:       e8 8b 09 d7 ff          callq  ffffffff810662c0 <__fentry__>
> ffffffff812f5935:       41 55                   push   %r13
> ffffffff812f5937:       41 54                   push   %r12
> ffffffff812f5939:       55                      push   %rbp
> ffffffff812f593a:       48 89 fd                mov    %rdi,%rbp
> ffffffff812f593d:       53                      push   %rbx
> ffffffff812f593e:       48 83 ec 30             sub    $0x30,%rsp
> ffffffff812f5942:       4c 8b 67 38             mov    0x38(%rdi),%r12
> ffffffff812f5946:       65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
> ffffffff812f594d:       00 00
> ffffffff812f594f:       48 89 44 24 28          mov    %rax,0x28(%rsp)
> ffffffff812f5954:       31 c0                   xor    %eax,%eax
> ffffffff812f5956:       48 c7 04 24 00 00 00    movq   $0x0,(%rsp)
> ffffffff812f595d:       00
> ffffffff812f595e:       49 8b 9c 24 b0 03 00    mov    0x3b0(%r12),%rbx
> ffffffff812f5965:       00
> 
> If my calculation is right, it means map->notifier.mm is NULL.
> 
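The offset arithmetic in the quoted analysis can be double-checked in user space. The sketch below is only an illustration: the `mock_*` structs are simplified stand-ins written from memory of the 5.10 layout of struct mmu_interval_notifier (an interval tree node, then the ops pointer, then the mm pointer), not the kernel headers, and every pointer is modelled as a 64-bit integer to mimic x86_64. Under those assumptions the notifier's mm pointer sits at offset 0x38, and a NULL value in %r12 makes the load at 0x3b0(%r12) touch address 0x3b0, which matches CR2 in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-ins for the kernel structs (assumed layout, not the
 * real headers). Pointers and unsigned longs are modelled as uint64_t
 * so the offsets match an x86_64 build.
 */
struct mock_rb_node {
	uint64_t rb_parent_color;
	uint64_t rb_right;
	uint64_t rb_left;
};

struct mock_interval_tree_node {
	struct mock_rb_node rb;
	uint64_t start;
	uint64_t last;
	uint64_t subtree_last;
};

struct mock_mmu_interval_notifier {
	struct mock_interval_tree_node interval_tree;
	uint64_t ops;	/* stands in for the ops pointer */
	uint64_t mm;	/* stands in for the struct mm_struct pointer */
};

/* Offset of the mm pointer inside the mock notifier. */
static size_t mock_mm_offset(void)
{
	return offsetof(struct mock_mmu_interval_notifier, mm);
}

/* Address read by "mov 0x3b0(%r12),%rbx" for a given %r12 value. */
static uint64_t faulting_address(uint64_t r12)
{
	return r12 + 0x3b0;
}
```

In the register dump, %rdi points at the notifier and %r12 is 0, so both helper results line up with the disassembly: the load from 0x38(%rdi) fetched a NULL mm, and the next dereference faulted.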

Could you try the attached patch?


Juergen

[-- Attachment #1.1.2: 0001-xen-gntdev-fix-gntdev_mmap-error-exit-path.patch --]
[-- Type: text/x-patch, Size: 1472 bytes --]

From 7ff3c32b36279aacef9cf80f4103fc6050759c10 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Mon, 19 Apr 2021 11:15:59 +0200
Subject: [PATCH] xen/gntdev: fix gntdev_mmap() error exit path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Commit d3eeb1d77c5d0af ("xen/gntdev: use mmu_interval_notifier_insert")
introduced an error in gntdev_mmap(): if the call to
mmu_interval_notifier_insert_locked() fails, the exit path must not
call mmu_interval_notifier_remove().

One possible reason for such a failure is a signal pending for the
running process.

Fixes: d3eeb1d77c5d0af ("xen/gntdev: use mmu_interval_notifier_insert")
Cc: stable@vger.kernel.org
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/xen/gntdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index f01d58c7a042..a3e7be96527d 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -1017,8 +1017,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 		err = mmu_interval_notifier_insert_locked(
 			&map->notifier, vma->vm_mm, vma->vm_start,
 			vma->vm_end - vma->vm_start, &gntdev_mmu_ops);
-		if (err)
+		if (err) {
+			map->vma = NULL;
 			goto out_unlock_put;
+		}
 	}
 	mutex_unlock(&priv->lock);
 
-- 
2.26.2
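
The effect of the one-line change can be modelled outside the kernel. The following is a rough user-space sketch, not the driver code: all `mock_*` names are hypothetical, the failure code -4 (-EINTR) is an assumed example of a pending-signal error, and the cleanup label assumes that the real exit path only calls mmu_interval_notifier_remove() while map->vma is still set, which is what the patch relies on. The counter `remove_on_null_mm` records the case that would oops in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; these are not the kernel's types. */
struct mock_notifier {
	void *mm;	/* set only by a successful insert */
};

struct mock_map {
	struct mock_notifier notifier;
	void *vma;
};

static int remove_on_null_mm;	/* removals of a never-inserted notifier */

static int mock_insert(struct mock_notifier *n, void *mm, int fail)
{
	if (fail)
		return -4;	/* assumed -EINTR; n->mm stays NULL */
	n->mm = mm;
	return 0;
}

static void mock_remove(struct mock_notifier *n)
{
	if (!n->mm)
		remove_on_null_mm++;	/* the kernel would dereference NULL here */
	n->mm = NULL;
}

/* Models the shape of the mmap error path, with and without the fix. */
static int mock_mmap(struct mock_map *map, void *vma, void *mm,
		     int fail_insert, int with_fix)
{
	int err;

	map->vma = vma;
	err = mock_insert(&map->notifier, mm, fail_insert);
	if (err) {
		if (with_fix)
			map->vma = NULL;	/* the line added by the patch */
		goto out_put;
	}
	return 0;

out_put:
	/* Assumed cleanup: remove the notifier only while a vma is recorded. */
	if (map->vma) {
		mock_remove(&map->notifier);
		map->vma = NULL;
	}
	return err;
}
```

Calling `mock_mmap()` with `fail_insert` set and `with_fix` clear bumps `remove_on_null_mm`, while the same call with `with_fix` set leaves it at zero, because clearing `map->vma` before the jump makes the cleanup skip the removal.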



