From: Linus Torvalds <torvalds@linux-foundation.org> To: Javier Martinez Canillas <javierm@redhat.com> Cc: "amd-gfx list" <amd-gfx@lists.freedesktop.org>, "Hans de Goede" <hdegoede@redhat.com>, dri-devel <dri-devel@lists.freedesktop.org>, "Thomas Zimmermann" <tzimmermann@suse.de>, "Alex Deucher" <alexander.deucher@amd.com>, "Christian König" <christian.koenig@amd.com> Subject: Re: Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash Date: Mon, 27 Jun 2022 10:25:39 -0700 [thread overview] Message-ID: <CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com> (raw) In-Reply-To: <3920df43-37f5-618d-70ba-de34a886e8ab@redhat.com> On Mon, Jun 27, 2022 at 1:02 AM Javier Martinez Canillas <javierm@redhat.com> wrote: > > The flag was dropped because it was causing drivers that requested their > memory resource with pci_request_region() to fail with -EBUSY (e.g: the > vmwgfx driver): > > https://www.spinics.net/lists/dri-devel/msg329672.html See, *that* link would have been useful in the commit. Rather than the useless link it has. Anyway, removing the busy bit just made things worse. > > If simplefb is actually still using that frame buffer, it's a problem. > > If it isn't, then maybe that resource should have been released? > > It's supposed to be released once amdgpu asks for conflicting framebuffers > to be removed calling drm_aperture_remove_conflicting_pci_framebuffers(). That most definitely doesn't happen. This is on a running system: [torvalds@ryzen linux]$ cat /proc/iomem | grep BOOTFB 00000000-00000000 : BOOTFB so I suspect that the BUSY bit was never the problem - even for vmwgfx). The problem was that simplefb doesn't remove its resource. Guys, the *reason* for resource management is to catch people that trample over each other's resources. You literally basically disabled the code that checked for it by removing the BUSY flag, and just continued to have conflicting resources. That isn't a "fix", that is literally "we are ignoring and breaking the whole reason that the resource tree exists, but we'll still use it for no good reason". Yeah, yeah, most modern drivers ignore the IO resource tree, because they end up working on another resource level entirely: they work on not the IO resources, but on the "driver level" instead, and just attach to PCI devices. So these days, few enough drivers even care about the IO resource tree, and it's mostly used for (a) legacy devices (think ISA) and (b) the actual bus resource handling (so the PCI code itself uses it to sort out resource use and avoid conflicts, but PCI drivers themselves generally then don't care, because the bus has "taken care of it". So that's why the amdgpu driver itself doesn't care about resource allocations, and we only get a warning for that memory type case, not for any deeper resource case. And apparently the vmwgfx driver still uses that legacy "let's claim all PCI resources in the resource tree" instead of just claiming the device itself. Which is why it hit this whole BOOTFB resource thing even harder. But the real bug is that BOOTFB seems to claim this resource even after it is done with it and other drivers want to take over. Not the BUSY bit. Linus
WARNING: multiple messages have this Message-ID (diff)
From: Linus Torvalds <torvalds@linux-foundation.org> To: Javier Martinez Canillas <javierm@redhat.com> Cc: "amd-gfx list" <amd-gfx@lists.freedesktop.org>, "Hans de Goede" <hdegoede@redhat.com>, dri-devel <dri-devel@lists.freedesktop.org>, "Thomas Zimmermann" <tzimmermann@suse.de>, "Alex Deucher" <alexander.deucher@amd.com>, "Christian König" <christian.koenig@amd.com>, "Zack Rusin" <zackr@vmware.com> Subject: Re: Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash Date: Mon, 27 Jun 2022 10:25:39 -0700 [thread overview] Message-ID: <CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com> (raw) In-Reply-To: <3920df43-37f5-618d-70ba-de34a886e8ab@redhat.com> On Mon, Jun 27, 2022 at 1:02 AM Javier Martinez Canillas <javierm@redhat.com> wrote: > > The flag was dropped because it was causing drivers that requested their > memory resource with pci_request_region() to fail with -EBUSY (e.g: the > vmwgfx driver): > > https://www.spinics.net/lists/dri-devel/msg329672.html See, *that* link would have been useful in the commit. Rather than the useless link it has. Anyway, removing the busy bit just made things worse. > > If simplefb is actually still using that frame buffer, it's a problem. > > If it isn't, then maybe that resource should have been released? > > It's supposed to be released once amdgpu asks for conflicting framebuffers > to be removed calling drm_aperture_remove_conflicting_pci_framebuffers(). That most definitely doesn't happen. This is on a running system: [torvalds@ryzen linux]$ cat /proc/iomem | grep BOOTFB 00000000-00000000 : BOOTFB so I suspect that the BUSY bit was never the problem - even for vmwgfx). The problem was that simplefb doesn't remove its resource. Guys, the *reason* for resource management is to catch people that trample over each other's resources. You literally basically disabled the code that checked for it by removing the BUSY flag, and just continued to have conflicting resources. That isn't a "fix", that is literally "we are ignoring and breaking the whole reason that the resource tree exists, but we'll still use it for no good reason". Yeah, yeah, most modern drivers ignore the IO resource tree, because they end up working on another resource level entirely: they work on not the IO resources, but on the "driver level" instead, and just attach to PCI devices. So these days, few enough drivers even care about the IO resource tree, and it's mostly used for (a) legacy devices (think ISA) and (b) the actual bus resource handling (so the PCI code itself uses it to sort out resource use and avoid conflicts, but PCI drivers themselves generally then don't care, because the bus has "taken care of it". So that's why the amdgpu driver itself doesn't care about resource allocations, and we only get a warning for that memory type case, not for any deeper resource case. And apparently the vmwgfx driver still uses that legacy "let's claim all PCI resources in the resource tree" instead of just claiming the device itself. Which is why it hit this whole BOOTFB resource thing even harder. But the real bug is that BOOTFB seems to claim this resource even after it is done with it and other drivers want to take over. Not the BUSY bit. Linus
next prev parent reply other threads:[~2022-06-27 17:26 UTC|newest] Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-06-26 18:54 Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash Linus Torvalds 2022-06-27 8:02 ` Javier Martinez Canillas 2022-06-27 17:25 ` Linus Torvalds [this message] 2022-06-27 17:25 ` Linus Torvalds 2022-06-28 8:43 ` Thomas Zimmermann 2022-06-28 12:41 ` Jocelyn Falempe 2022-06-29 8:20 ` Javier Martinez Canillas 2022-06-27 8:56 ` Thomas Zimmermann
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@mail.gmail.com' \ --to=torvalds@linux-foundation.org \ --cc=alexander.deucher@amd.com \ --cc=amd-gfx@lists.freedesktop.org \ --cc=christian.koenig@amd.com \ --cc=dri-devel@lists.freedesktop.org \ --cc=hdegoede@redhat.com \ --cc=javierm@redhat.com \ --cc=tzimmermann@suse.de \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.