amd-gfx.lists.freedesktop.org archive mirror
From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Alex Deucher <alexdeucher@gmail.com>,
	michel@daenzer.net, Borislav Petkov <bp@alien8.de>,
	amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: grab extra fence reference for drm_sched_job_add_dependency
Date: Mon, 9 Jan 2023 14:40:45 +0100
Message-ID: <82c8b18d-4e51-a137-6078-43b380661c37@gmail.com>
In-Reply-To: <CABXGCsMJxX3wo8yhQA=nOk0ouzh-WGp_65DJBYb_9v2m4kk7Mw@mail.gmail.com>

On 09.01.23 at 14:13, Mikhail Gavrilov wrote:
> On Fri, Jan 6, 2023 at 8:27 PM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>>
>> And it looks like Dmitry submitted it initially to the wrong branch.
>>
>> Because of this it wasn't scheduled as a fix for 6.2, but rather queued up
>> as a new feature for 6.3.
>>
>> This is fixed by now and the patch should show up in the next -rc.
>>
>> Regards,
>> Christian.
>>
> Hi,
> I'm not sure whether this is related to the patch, but I caught a
> kernel oops this weekend.
> It is hard to reproduce, and I don't know exactly which actions
> trigger it, but I'm sure that it happens when I launch
> "Cyberpunk 2077" while Google Chrome is running with a large number
> of open windows and tabs.
> Even those two conditions are not enough on their own, though.
> In a way that is not entirely clear to me, a memory leak also has to occur.

That looks like an out-of-memory situation that is not handled gracefully.

In other words, we are missing a NULL check in drm_sched_job_cleanup().

Going to take a look.
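
To illustrate the kind of guard meant here, below is a minimal userspace
sketch and not the actual scheduler code. The names sketch_fence,
sketch_job, sketch_job_init() and sketch_job_cleanup() are hypothetical
stand-ins, and the layout does not match the kernel structures. It only
shows the pattern: if initialization fails because an allocation returns
NULL, the cleanup path has to tolerate the half-initialized object. The
faulting address 0x78 with RDI at zero in the quoted trace is consistent
with a field being read through such a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the scheduler structures; names and layout
 * are illustrative only and do not match the kernel's. */
struct sketch_fence {
	int refcount;
};

struct sketch_job {
	struct sketch_fence *fence;	/* stays NULL if allocation failed */
};

/* Returns 0 on success, -1 when the fence could not be allocated.
 * simulate_oom forces the failure path for demonstration purposes. */
static int sketch_job_init(struct sketch_job *job, int simulate_oom)
{
	assert(job != NULL);

	job->fence = simulate_oom ? NULL : calloc(1, sizeof(*job->fence));
	if (!job->fence)
		return -1;

	job->fence->refcount = 1;
	return 0;
}

/* Cleanup that tolerates a job whose init failed part-way.
 * Returns 1 if a fence was released, 0 if there was nothing to do. */
static int sketch_job_cleanup(struct sketch_job *job)
{
	assert(job != NULL);

	if (!job->fence)	/* the guard: never dereference a NULL fence */
		return 0;

	free(job->fence);
	job->fence = NULL;
	return 1;
}
```

With the guard in place, calling sketch_job_cleanup() on a job whose
sketch_job_init() failed simply returns 0 instead of dereferencing NULL.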

Thanks,
Christian.

>
> The trace looks like:
> BUG: kernel NULL pointer dereference, address: 0000000000000078
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 39818f067 P4D 39818f067 PUD 35bbd6067 PMD 4f8438067 PTE 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 21 PID: 100830 Comm: GameThread Tainted: G        W    L
> -------  ---  6.2.0-0.rc2.20230105git41c03ba9beea.20.fc38.x86_64 #1
> Hardware name: System manufacturer System Product Name/ROG STRIX
> X570-I GAMING, BIOS 4408 10/28/2022
> RIP: 0010:drm_sched_job_cleanup+0x1a/0x110 [gpu_sched]
> Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00
> 55 53 48 89 fb 48 83 ec 08 48 8b 7f 20 48 c7 04 24 00 00 00 00 <8b> 47
> 78 85 c0 0f 84 b5 00 00 00 48 83 ff c0 74 1f 48 8d 57 78 b8
> RSP: 0018:ffffae3e16c0b9d0 EFLAGS: 00010282
> RAX: 0000000000000001 RBX: ffff91de6f7bc000 RCX: 00000000012a8976
> RDX: 0000000000000000 RSI: ffffffffadbda69b RDI: 0000000000000000
> RBP: ffff91de6f7bc000 R08: 0000000000000001 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000000000 R12: 00000000ffffffff
> R13: 0000000000000018 R14: ffff91e259275000 R15: 0000000000000001
> FS:  000000007bcff6c0(0000) GS:ffff91e667e00000(0000) knlGS:000000007abe0000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000078 CR3: 0000000297a24000 CR4: 0000000000350ee0
> Call Trace:
>   <TASK>
>   amdgpu_job_free+0x1d/0x120 [amdgpu]
>   amdgpu_cs_parser_fini+0x119/0x170 [amdgpu]
>   amdgpu_cs_ioctl+0x3f4/0x2000 [amdgpu]
>   ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
>   drm_ioctl_kernel+0xac/0x160
>   drm_ioctl+0x1e7/0x450
>   ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu]
>   amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
>   __x64_sys_ioctl+0x90/0xd0
>   do_syscall_64+0x5b/0x80
>   ? do_syscall_64+0x67/0x80
>   ? lock_is_held_type+0xe8/0x140
>   ? asm_sysvec_call_function+0x16/0x20
>   ? lockdep_hardirqs_on+0x7d/0x100
>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
> RIP: 0033:0x7fe30905e65f
> Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48
> 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2
> 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
> RSP: 002b:000000007bcfd410 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 000000007bcfd738 RCX: 00007fe30905e65f
> RDX: 000000007bcfd520 RSI: 00000000c0186444 RDI: 00000000000000b6
> RBP: 000000007bcfd520 R08: 00007fe2800a6b80 R09: 000000007bcfd4b0
> R10: 000000007e22b350 R11: 0000000000000246 R12: 00000000c0186444
> R13: 00000000000000b6 R14: 000000000000000d R15: 00007fe2800a6ab0
>   </TASK>
> Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer netconsole
> nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet
> nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep
> sunrpc binfmt_misc mt76x2u mt76x2_common mt76x02_usb mt76_usb iwlmvm
> mt76x02_lib mt76 mac80211 btusb iwlwifi libarc4 btrtl btbcm btintel
> btmtk hid_logitech_hidpp xpad bluetooth cfg80211 ff_memless joydev
> intel_rapl_msr intel_rapl_common edac_mce_amd eeepc_wmi
> snd_hda_codec_realtek kvm_amd asus_wmi snd_hda_codec_generic
> snd_seq_midi snd_seq_midi_event ledtrig_audio vfat asus_ec_sensors kvm
> sparse_keymap platform_profile snd_hda_codec_hdmi fat snd_usb_audio
> snd_hda_intel snd_intel_dspcfg snd_usbmidi_lib snd_intel_sdw_acpi
> irqbypass snd_rawmidi snd_hda_codec rapl rfkill mc snd_hda_core
> wmi_bmof pcspkr i2c_piix4 k10temp snd_hwdep snd_seq snd_seq_device
> [19447.812785]  snd_pcm acpi_cpufreq hid_logitech_dj snd_timer snd
> soundcore zram amdgpu drm_ttm_helper ttm video crct10dif_pclmul
> iommu_v2 crc32_pclmul crc32c_intel drm_buddy polyval_clmulni gpu_sched
> polyval_generic igb drm_display_helper nvme ucsi_ccg typec_ucsi
> ghash_clmulni_intel ccp typec sha512_ssse3 nvme_core cec sp5100_tco
> dca nvme_common wmi ip6_tables ip_tables fuse
> CR2: 0000000000000078
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:drm_sched_job_cleanup+0x1a/0x110 [gpu_sched]
> Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00
> 55 53 48 89 fb 48 83 ec 08 48 8b 7f 20 48 c7 04 24 00 00 00 00 <8b> 47
> 78 85 c0 0f 84 b5 00 00 00 48 83 ff c0 74 1f 48 8d 57 78 b8
> RSP: 0018:ffffae3e16c0b9d0 EFLAGS: 00010282
> RAX: 0000000000000001 RBX: ffff91de6f7bc000 RCX: 00000000012a8976
> RDX: 0000000000000000 RSI: ffffffffadbda69b RDI: 0000000000000000
> RBP: ffff91de6f7bc000 R08: 0000000000000001 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000000000 R12: 00000000ffffffff
> R13: 0000000000000018 R14: ffff91e259275000 R15: 0000000000000001
> FS:  000000007bcff6c0(0000) GS:ffff91e667e00000(0000) knlGS:000000007abe0000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000078 CR3: 0000000297a24000 CR4: 0000000000350ee0
>
>

