* PCI passthrough: possible bug in memory relocation
@ 2022-04-03 23:24 Mateusz
  2022-04-04 13:03 ` Jan Beulich
  0 siblings, 1 reply; 4+ messages in thread
From: Mateusz @ 2022-04-03 23:24 UTC (permalink / raw)
  To: xen-devel

Hello,
I'm working on resolving a bug in Qubes OS: when a VM is given too
much RAM, GPU passthrough stops working. I believe that in these cases
guest RAM overlaps the PCI MMIO addresses and Xen's memory relocation
fails to resolve the conflict.

Here are the memory BARs of the GPU I'm trying to pass through:
Memory at f3000000 (32-bit, non-prefetchable) [size=16M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at f0000000 (64-bit, prefetchable) [size=32M]
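
As a sanity check, here is the arithmetic (a rough sketch of my own,
not anything hvmloader actually runs; each BAR has to be naturally
aligned to its own size):

MiB = 1 << 20
bars = [256 * MiB, 32 * MiB, 16 * MiB]  # sizes from the lspci output above

addr = 1 << 32                          # place BARs top-down from 4G
for size in bars:                       # largest first
    addr = (addr - size) & ~(size - 1)  # align down to the BAR's size
    print(f"{size // MiB:3d} MiB BAR fits at {addr:#x}")

print(f"MMIO hole below 4G must be >= {((1 << 32) - addr) // MiB} MiB")

That gives a minimum of 304 MiB which, if I read hvmloader correctly,
overflows its default hole starting at 0xf0000000 (256 MiB), so it has
to grow the hole and relocate guest RAM.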

The interesting thing is that in xl dmesg hvmloader prints these
lines:
(d1) Relocating 0xffff pages from 0e0001000 to 187000000 for lowmem MMIO hole
(d1) Relocating 0x1 pages from 0e0000000 to 196fff000 for lowmem MMIO hole
so it looks like hvmloader tries to move these pages elsewhere to make
room for the PCI memory, yet for some reason passthrough still fails.
Changing TOLUD to 3.5G solves the problem. I tested this on Xen 4.14.3.
Here's the GitHub issue tracking this problem:
https://github.com/QubesOS/qubes-issues/issues/4321

Is this a bug in Xen whose fix would resolve the problem, or is there
something I'm missing?

* Re: PCI passthrough: possible bug in memory relocation
  2022-04-03 23:24 PCI passthrough: possible bug in memory relocation Mateusz
@ 2022-04-04 13:03 ` Jan Beulich
  2022-04-04 23:36   ` Mateusz
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Beulich @ 2022-04-04 13:03 UTC (permalink / raw)
  To: Mateusz; +Cc: xen-devel

On 04.04.2022 01:24, Mateusz wrote:
> Hello,
> I'm working on resolving a bug in Qubes OS: when a VM is given too
> much RAM, GPU passthrough stops working. I believe that in these cases
> guest RAM overlaps the PCI MMIO addresses and Xen's memory relocation
> fails to resolve the conflict.
> 
> Here are the memory BARs of the GPU I'm trying to pass through:
> Memory at f3000000 (32-bit, non-prefetchable) [size=16M]
> Memory at e0000000 (64-bit, prefetchable) [size=256M]
> Memory at f0000000 (64-bit, prefetchable) [size=32M]
> 
> The interesting thing is that in xl dmesg hvmloader prints these
> lines:
> (d1) Relocating 0xffff pages from 0e0001000 to 187000000 for lowmem MMIO hole
> (d1) Relocating 0x1 pages from 0e0000000 to 196fff000 for lowmem MMIO hole
> so it looks like hvmloader tries to move these pages elsewhere to make
> room for the PCI memory, yet for some reason passthrough still fails.
> Changing TOLUD to 3.5G solves the problem. I tested this on Xen 4.14.3.
> Here's the GitHub issue tracking this problem:
> https://github.com/QubesOS/qubes-issues/issues/4321
> 
> Is this a bug in Xen whose fix would resolve the problem, or is there
> something I'm missing?

I'm afraid answering this requires debugging the issue. Yet you don't
share any technical details (as to how exactly things fail, logs, and
the like), and the provided link doesn't appear to point to any such
information either (as an aside, I consider it somewhat unfriendly to
point at such a bug tracker entry as the primary information source
rather than just for reference). I'm pretty sure this code in
hvmloader did work at some point, but since it may be exercised
quite rarely I could see that it might have regressed.

Jan

* Re: PCI passthrough: possible bug in memory relocation
  2022-04-04 13:03 ` Jan Beulich
@ 2022-04-04 23:36   ` Mateusz
  2022-04-07 14:58     ` Jason Andryuk
  0 siblings, 1 reply; 4+ messages in thread
From: Mateusz @ 2022-04-04 23:36 UTC (permalink / raw)
  To: jbeulich; +Cc: xen-devel

> I'm afraid answering this requires debugging the issue. Yet you don't
> share any technical details (as to how exactly things fail, logs, and
> the like), and the provided link doesn't appear to point to any such
> information either (as an aside, I consider it somewhat unfriendly to
> point at such a bug tracker entry as the primary information source
> rather than just for reference). I'm pretty sure this code in
> hvmloader did work at some point, but since it may be exercised
> quite rarely I could see that it might have regressed.

Thanks for responding!
I mainly wanted to check whether this is a known issue, but I guess it
isn't. I intend to debug and fix it myself, which is why I haven't
posted more technical details yet.

* Re: PCI passthrough: possible bug in memory relocation
  2022-04-04 23:36   ` Mateusz
@ 2022-04-07 14:58     ` Jason Andryuk
  0 siblings, 0 replies; 4+ messages in thread
From: Jason Andryuk @ 2022-04-07 14:58 UTC (permalink / raw)
  To: Mateusz; +Cc: xen-devel

On Mon, Apr 4, 2022 at 7:37 PM Mateusz <mati7337@protonmail.ch> wrote:
>
> > I'm afraid answering this requires debugging the issue. Yet you don't
> > share any technical details (as to how exactly things fail, logs, and
> > the like), and the provided link doesn't appear to point to any such
> > information either (as an aside, I consider it somewhat unfriendly to
> > point at such a bug tracker entry as the primary information source
> > rather than just for reference). I'm pretty sure this code in
> > hvmloader did work at some point, but since it may be exercised
> > quite rarely I could see that it might have regressed.
>
> Thanks for responding!
> I mainly wanted to check whether this is a known issue, but I guess it
> isn't. I intend to debug and fix it myself, which is why I haven't
> posted more technical details yet.

OpenXT manually configures the xl.cfg mmio_hole setting by reading the
PCI BAR sizes.  The (Haskell) code is here:
https://github.com/OpenXT/manager/commit/33fef12b242e3cc9b46a32d07c84bc593ee537c9
It runs in dom0 and can access the PCI sysfs entries.
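
For illustration, the same idea in a few lines of Python (a sketch,
not OpenXT's actual logic; 0x200 is the kernel's IORESOURCE_MEM flag,
and the BDF below is just an example):

def mem_bar_sizes(bdf):
    # The first six lines of the sysfs "resource" file are BARs 0-5,
    # each "start end flags" in hex; unassigned slots read as zeros.
    sizes = []
    with open(f"/sys/bus/pci/devices/{bdf}/resource") as f:
        for line in list(f)[:6]:
            start, end, flags = (int(x, 16) for x in line.split())
            if start and (flags & 0x200):   # assigned memory BAR
                sizes.append(end - start + 1)
    return sizes

print(mem_bar_sizes("0000:01:00.0"))        # example BDF

Summing those sizes (plus some rounding for alignment) yields the
mmio_hole value to put in xl.cfg.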

Trying to do this in QEMU in the stubdom is tricky.  Vanilla Xen
hotplugs the PV PCI devices to the stubdom and then hotplugs those
into QEMU with QMP device_add.  (Qubes changed (or has a change
pending) to cold-plugging the PV PCI devices into the stubdom, but
they are still hotplugged into QEMU via QMP device_add calls.)  You
won't know which PCI devices are applicable, but I guess in the
stubdom you can just assume all of the PV ones should be passed
through.
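
For reference, the QMP side of that hotplug looks roughly like the
following (a sketch; the socket path is hypothetical, and "hostaddr"
is my reading of the property QEMU's xen-pci-passthrough device takes
for the host BDF):

import json, socket

def qmp_device_add(sock_path, bdf):
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(sock_path)
    f = s.makefile("rw")
    f.readline()                             # consume the QMP greeting
    for cmd in (
        {"execute": "qmp_capabilities"},     # leave negotiation mode
        {"execute": "device_add",
         "arguments": {"driver": "xen-pci-passthrough",
                       "hostaddr": bdf,
                       "id": "pci-" + bdf.replace(":", "-")}},
    ):
        f.write(json.dumps(cmd) + "\n")
        f.flush()
        print(f.readline().rstrip())         # {"return": {}} on success
    s.close()

qmp_device_add("/run/qemu-qmp.sock", "0000:01:00.0")  # hypothetical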

It would be better for QEMU to do this itself, but hotplugging PCI
devices means it doesn't have the needed information during startup.

Looking at https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux/pull/44,
you read /proc/iomem - I guess that works with the cold plugging of PV
PCI devices into the stubdom.  However, you only grab the first PCI
device and return after that - not the sum of all PCI devices detected.
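
Something along these lines would count them all (a sketch; it assumes
the regions labelled with a BDF-shaped name in the stubdom's
/proc/iomem are exactly the passed-through BARs):

import re

# Match lines like "  f3000000-f3ffffff : 0000:01:00.0" at any nesting.
BDF_RE = re.compile(
    r"^\s*([0-9a-f]+)-([0-9a-f]+) : "
    r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\s*$")

def pci_mmio_total(path="/proc/iomem"):
    total = 0
    with open(path) as f:
        for line in f:
            m = BDF_RE.match(line)
            if m:
                start, end = int(m.group(1), 16), int(m.group(2), 16)
                total += end - start + 1   # every matching device, not one
    return total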

libxl knows the devices that will be assigned at domU boot time, so
it could do the calculation and adjust the mmio_hole size itself.
That wouldn't help hotplugging, but it would be better than the
situation today.  libxl already does some PCI sysfs stuff, so it
might be a good place to solve this.
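
The calculation itself is small; sketched here in Python (libxl would
of course do it in C), rounding the hole up to a power of two so the
guest's TOLUD lands on a clean boundary:

def suggested_mmio_hole(bar_sizes):
    addr = 1 << 32
    for size in sorted(bar_sizes, reverse=True):
        addr = (addr - size) & ~(size - 1)  # naturally align each BAR
    return 1 << (((1 << 32) - addr) - 1).bit_length()  # next power of two

MiB = 1 << 20
print(suggested_mmio_hole([256 * MiB, 32 * MiB, 16 * MiB]) // MiB)  # 512

For the GPU earlier in the thread that comes out to 512 MiB, i.e. the
TOLUD-at-3.5G configuration that was reported to work.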

Regards,
Jason
