* A question about PCI passthrough device BAR memory size
@ 2012-06-28 23:12 Rolu
  2012-06-29  7:51 ` Jan Beulich
  0 siblings, 1 reply; 2+ messages in thread
From: Rolu @ 2012-06-28 23:12 UTC (permalink / raw)
  To: xen-devel

I am passing through to a domU (among other things) two USB
controllers. Here is the lspci -v output on the dom0:

00:1d.0 USB controller: Intel Corporation Panther Point USB Enhanced
Host Controller #1 (rev 04) (prog-if 20 [EHCI])
	Subsystem: ASRock Incorporation Device 1e26
	Flags: bus master, medium devsel, latency 0, IRQ 23
	Memory at f7d17000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Debug port: BAR=1 offset=00a0
	Capabilities: [98] PCI Advanced Features
	Kernel driver in use: pciback

And here is the same device's output in the domU:

00:07.0 USB controller: Intel Corporation Panther Point USB Enhanced
Host Controller #1 (rev 04) (prog-if 20 [EHCI])
	Subsystem: ASRock Incorporation Device 1e26
	Flags: bus master, medium devsel, latency 64, IRQ 44
	Memory at f3056000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [50] Power Management version 2
	Kernel driver in use: ehci_hcd

The output for the other controller is essentially the same.

The peculiar thing here is that the domU thinks it has a 4K memory
area while the dom0 says it's just 1K. The controllers work, and I
don't know enough about the PCI subsystems to say if this could cause
issues, but it seems things could go wrong if the domU ever decides to
use the other 3K of memory.

I had a look at how this value is calculated. I found that the guest
writes all ones to the BAR and then reads it back, and the size of the
memory area is determined by how many low bits come back as zero (per the
PCI specs). In qemu, in hw/pass-through.c, pt_bar_reg_write and
pt_bar_reg_read are responsible for emulating the writing and reading.
In pt_bar_reg_read, there is:

/* align resource size (memory type only) */
PT_GET_EMUL_SIZE(base->bar_flag, r_size);

For memory type BAR this macro changes r_size to:

(((r_size) + XC_PAGE_SIZE - 1) & ~(XC_PAGE_SIZE - 1));

This looks like it rounds r_size up to the next multiple of
XC_PAGE_SIZE, and logging confirms this is changing r_size from 0x400
to 0x1000. This ends up giving the guest the rounded up size, instead
of the real size.
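
To make the arithmetic concrete, here is a minimal sketch of the two steps described above: decoding a size from the value read back after the all-ones write, and the page rounding done by the macro. This is not the qemu code. The function names and EXAMPLE_PAGE_SIZE are made up for illustration, and a 4 KiB page is assumed in place of XC_PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, standing in for XC_PAGE_SIZE. */
#define EXAMPLE_PAGE_SIZE 0x1000u

/* Low four bits of a 32-bit memory BAR carry type/prefetch flags, not address. */
#define MEM_BAR_FLAG_MASK 0xFu

/* Decode the region size from the value read back after writing all ones. */
static uint32_t bar_size_from_readback(uint32_t readback)
{
    uint32_t addr_bits = readback & ~MEM_BAR_FLAG_MASK;

    assert(addr_bits != 0); /* all zero would mean the BAR is not implemented */
    return ~addr_bits + 1u; /* lowest writable address bit gives the size */
}

/* Same arithmetic as the macro quoted above, for a memory BAR. */
static uint32_t round_up_to_page(uint32_t r_size)
{
    return (r_size + EXAMPLE_PAGE_SIZE - 1u) & ~(EXAMPLE_PAGE_SIZE - 1u);
}
```

With these, a read-back of 0xFFFFFC00 decodes to 0x400 (1K), and round_up_to_page(0x400) gives 0x1000 (4K), which matches what the logging showed.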

So,
* is this an actual potential problem, or will something else ensure
that the guest isn't going to try to use the extra memory?
* if it needs fixing, how can it be done? I've looked through the code
but I'm not sure how to fix it without breaking other things.

* Re: A question about PCI passthrough device BAR memory size
  2012-06-28 23:12 A question about PCI passthrough device BAR memory size Rolu
@ 2012-06-29  7:51 ` Jan Beulich
  0 siblings, 0 replies; 2+ messages in thread
From: Jan Beulich @ 2012-06-29  7:51 UTC (permalink / raw)
  To: Rolu; +Cc: xen-devel

>>> On 29.06.12 at 01:12, Rolu <rolu@roce.org> wrote:
> I am passing through to a domU (among other things) two USB
> controllers. Here is the lspci -v output on the dom0:
> 
> 00:1d.0 USB controller: Intel Corporation Panther Point USB Enhanced
> Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> 	Subsystem: ASRock Incorporation Device 1e26
> 	Flags: bus master, medium devsel, latency 0, IRQ 23
> 	Memory at f7d17000 (32-bit, non-prefetchable) [size=1K]
> 	Capabilities: [50] Power Management version 2
> 	Capabilities: [58] Debug port: BAR=1 offset=00a0
> 	Capabilities: [98] PCI Advanced Features
> 	Kernel driver in use: pciback
> 
> And here is the same device's output in the domU:
> 
> 00:07.0 USB controller: Intel Corporation Panther Point USB Enhanced
> Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> 	Subsystem: ASRock Incorporation Device 1e26
> 	Flags: bus master, medium devsel, latency 64, IRQ 44
> 	Memory at f3056000 (32-bit, non-prefetchable) [size=4K]
> 	Capabilities: [50] Power Management version 2
> 	Kernel driver in use: ehci_hcd
> 
> The output for the other controller is essentially the same.
> 
> The peculiar thing here is that the domU thinks it has a 4K memory
> area while the dom0 says it's just 1K. The controllers work, and I
> don't know enough about the PCI subsystems to say if this could cause
> issues, but it seems things could go wrong if the domU ever decides to
> use the other 3K of memory.
> 
> I had a look at how this value is calculated. I found that the guest
> writes all ones to the BAR and then reads it back, and the size of the
> memory area is determined by how many low bits come back as zero (per the
> PCI specs). In qemu, in hw/pass-through.c, pt_bar_reg_write and
> pt_bar_reg_read are responsible for emulating the writing and reading.
> In pt_bar_reg_read, there is:
> 
> /* align resource size (memory type only) */
> PT_GET_EMUL_SIZE(base->bar_flag, r_size);
> 
> For memory type BAR this macro changes r_size to:
> 
> (((r_size) + XC_PAGE_SIZE - 1) & ~(XC_PAGE_SIZE - 1));
> 
> This looks like it rounds r_size up to the next multiple of
> XC_PAGE_SIZE, and logging confirms this is changing r_size from 0x400
> to 0x1000. This ends up giving the guest the rounded up size, instead
> of the real size.
> 
> So,
> * is this an actual potential problem, or will something else ensure
> that the guest isn't going to try to use the extra memory?

I think it is wrong for qemu-dm to not honor the original size. A
driver handling different device versions/implementations could
look at this and adapt its behavior accordingly (and would likely
fail then).

The second aspect to this, making sure the guest doesn't access
some other guest's (or the host's) MMIO space, is something to be
taken care of in the host, actually. The host has to re-assign
(or assign in the first place, should the firmware not have done
so) resources such that no two devices to be passed through to
a guest share the same PAGE_SIZE region for their MMIO blocks.
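
As a rough illustration of that condition, the check below tests whether two MMIO blocks touch a common page. It is only a sketch: struct mmio_range, ranges_share_page and EXAMPLE_PAGE_SIZE are hypothetical names, a 4 KiB page is assumed, and this is not code from any kernel or from qemu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define EXAMPLE_PAGE_SIZE 0x1000ull

/* Hypothetical description of one device's MMIO block. */
struct mmio_range {
    uint64_t base;
    uint64_t size;
};

static uint64_t first_page(const struct mmio_range *r)
{
    return r->base / EXAMPLE_PAGE_SIZE;
}

static uint64_t last_page(const struct mmio_range *r)
{
    return (r->base + r->size - 1) / EXAMPLE_PAGE_SIZE;
}

/* True if any page is touched by both blocks. */
static bool ranges_share_page(const struct mmio_range *a,
                              const struct mmio_range *b)
{
    assert(a->size > 0 && b->size > 0);
    return first_page(a) <= last_page(b) && first_page(b) <= last_page(a);
}
```

For example, the 1K block at f7d17000 from the lspci output would share a page with a (made-up) 1K block at f7d17400, but not with one at f7d18000.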

In the non-pvops kernel we have special code and command line
options for this, but I believe this became redundant with other
code and options in the upstream kernels by now (I just never
got around to checking how much redundancy there is and
could hence be eliminated).

In any case, these are things that - afaict - need manual admin
action to get right _before_ passing through any device to a
guest.

> * if it needs fixing, how can it be done? I've looked through the code
> but I'm not sure how to fix it without breaking other things.

Since qemu ought to be able to find out the real device's BAR
sizes, it shouldn't be that difficult to make it use that value in
the config space access emulation rather than the rounded
up one - in the worst case it would have to track two values
instead of one.
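
One possible shape for the "two values" idea, purely as a sketch: keep the device's real size for what the guest sees in config space, and the page-rounded size for the mapping. The struct and function names here are invented and do not correspond to anything in hw/pass-through.c; a 4 KiB page is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, standing in for XC_PAGE_SIZE. */
#define EXAMPLE_PAGE_SIZE 0x1000u

/* Hypothetical per-BAR state tracking both sizes. */
struct emu_bar {
    uint32_t real_size; /* size of the physical device's BAR */
    uint32_t map_size;  /* page-rounded size used when mapping */
};

static struct emu_bar emu_bar_init(uint32_t real_size)
{
    struct emu_bar bar;

    /* Memory BAR sizes are powers of two and at least 16 bytes. */
    assert(real_size >= 16u && (real_size & (real_size - 1u)) == 0);

    bar.real_size = real_size;
    bar.map_size = (real_size + EXAMPLE_PAGE_SIZE - 1u)
                   & ~(EXAMPLE_PAGE_SIZE - 1u);
    return bar;
}

/* Value the guest should read back after writing all ones to the BAR. */
static uint32_t emu_bar_sizing_readback(const struct emu_bar *bar,
                                        uint32_t flags)
{
    /* Use the real size here, so the guest decodes 1K rather than 4K. */
    return ~(bar->real_size - 1u) | (flags & 0xFu);
}
```

For the 1K EHCI BAR this yields a read-back of 0xFFFFFC00 while map_size stays at 0x1000.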

Jan
