From: "Christian König" <christian.koenig@amd.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: David Airlie <airlied@linux.ie>,
	Alex Deucher <alexander.deucher@amd.com>,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	Maxim Levitsky <mlevitsk@redhat.com>
Subject: Re: Couple of issues with amdgpu on my WX4100
Date: Mon, 4 Jan 2021 18:39:33 +0100
Message-ID: <ea539e21-aed3-8f23-74b2-5a214fa9fdb2@amd.com>
In-Reply-To: <20210104094547.06a61444@omen.home>

Am 04.01.21 um 17:45 schrieb Alex Williamson:
> On Mon, 4 Jan 2021 12:34:34 +0100
> Christian König <christian.koenig@amd.com> wrote:
>
>> Hi Maxim,
>>
>> I can't help with the display related stuff. Probably the best approach to
>> get this fixed would be to open a bug report for this on FDO.
>>
>> But I'm the one who implemented the resizeable BAR support and your
>> analysis of the problem sounds about correct to me.
>>
>> The reason why this works on Linux is most likely because we restore the
>> BAR size on resume (and maybe during initial boot as well).
>>
>> See this patch for reference:
>>
>> commit d3252ace0bc652a1a244455556b6a549f969bf99
>> Author: Christian König <ckoenig.leichtzumerken@gmail.com>
>> Date:   Fri Jun 29 19:54:55 2018 -0500
>>
>>       PCI: Restore resized BAR state on resume
>>
>>       Resize BARs after resume to the expected size again.
>>
>>       BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199959
>>       Fixes: d6895ad39f3b ("drm/amdgpu: resize VRAM BAR for CPU access v6")
>>       Fixes: 276b738deb5b ("PCI: Add resizable BAR infrastructure")
>>       Signed-off-by: Christian König <christian.koenig@amd.com>
>>       Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>       CC: stable@vger.kernel.org      # v4.15+
>>
>>
>> It should be trivial to add this to the reset module as well. Most
>> likely even completely vendor independent since I'm not sure what a bus
>> reset will do to this configuration and restoring it all the time should
>> be the most defensive approach.
> Hmm, this should already be used by the bus/slot reset path:
>
> pci_bus_restore_locked()/pci_slot_restore_locked()
>   pci_dev_restore()
>    pci_restore_state()
>     pci_restore_rebar_state()
>
> VFIO support for resizeable BARs has been on my todo list, but I don't
> have access to any systems that have both a capable device and >4G
> decoding enabled in the BIOS.  If we have a consistent view of the BAR
> size after the BARs are expanded, I'm not sure why it doesn't just
> work.  FWIW, QEMU currently hides the REBAR capability from the guest
> because the kernel driver doesn't support emulation through config
> space (i.e. it's read-only, which the spec doesn't support).

In this case the guest shouldn't be able to change the config at all and 
I have no idea what's going wrong here.

> AIUI, resource allocation can fail when enabling REBAR support, which
> is a problem if the failure occurs on the host but not the guest since
> we have no means via the hardware protocol to expose such a condition.
> Therefore the model I was considering for vfio-pci would be to simply
> pre-enable REBAR at the max size.

That's a rather bad idea. Our GPUs, for example, report way more than
they actually need.

E.g. a Polaris usually returns 4GiB even when only 2GiB are installed, 
because 4GiB is just the maximum amount of RAM you can put together with 
the ASIC on a board.

Some devices even return a mask of all 1s even when they need only 2MiB,
resulting in nearly 1TiB of wasted address space with this approach.
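
To make the difference concrete, here is a minimal sketch in plain C (not
kernel code) that compares "pre-enable the max size" with "pick the smallest
advertised size that fits". It assumes the supported sizes are given as a
bitmask where bit n means 2^(n+20) bytes, i.e. bit 0 is 1MiB, as in the
Resizable BAR capability encoding. The function names and the example masks
are hypothetical and only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: size index 0 means 1 MiB and each step doubles the size. */
static uint64_t rebar_index_to_bytes(unsigned int index)
{
	assert(index < 44);	/* keep the shift inside 64 bits */
	return (uint64_t)1 << (index + 20);
}

/* Largest size advertised in the mask (bit n = index n), 0 if mask is empty. */
static uint64_t rebar_max_bytes(uint32_t sizes)
{
	unsigned int index = 0;

	if (!sizes)
		return 0;
	while (sizes >>= 1)
		index++;
	return rebar_index_to_bytes(index);
}

/* Smallest advertised size that covers 'needed' bytes, 0 if none does. */
static uint64_t rebar_min_fit_bytes(uint32_t sizes, uint64_t needed)
{
	for (unsigned int index = 0; index < 32; index++) {
		if ((sizes & (1u << index)) &&
		    rebar_index_to_bytes(index) >= needed)
			return rebar_index_to_bytes(index);
	}
	return 0;
}
```

With a hypothetical mask of 20 set bits (0xFFFFF), rebar_max_bytes() gives
512GiB, while rebar_min_fit_bytes() for a 2MiB requirement gives 2MiB. How
much the device actually needs is not visible in the mask, so that value
would have to come from the driver.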

Regards,
Christian.

>    It might be sufficiently safe to
> test BAR expansion on initialization and then allow user control, but
> I'm concerned that resource availability could change while already in
> use by the user.  Thanks,
>
> Alex
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

